Resistivity of Bulk Silicon and of
Diffused Layers in Silicon 


By JOHN C. IRVIN 
(Manuscript received July 25, 1961) 


Measurements of resistivity and impurity concentration in heavily doped 
silicon are reported. These and previously published data are incorporated 
in a graph showing the resistivity (at T = 300°K) of n- and p-type silicon
as a function of donor or acceptor concentration. 

The relationship between surface concentration and average conductivity 
of diffused layers in silicon has been calculated for Gaussian and comple- 
mentary error function distributions. The results are shown graphically. 
Similar calculations for subsurface layers, such as a transistor base region, 
are also given. 


I. INTRODUCTION 


A diffused layer in silicon is generally characterized by four parameters:
the concentration, C_s, of diffused donors or acceptors at the surface;
the concentration, C_B, of acceptors or donors originally in the material
(background concentration); the depth, x_j, of the resultant junction;
and the sheet resistivity, ρ_s, of the layer. A knowledge of the relationship
between these parameters is essential to the establishment of device 
processing recipes, the evaluation of diffusion techniques, and investiga- 
tions of the thermodynamic properties of silicon. 

The desired relationship may be readily calculated, given a knowledge 
of the distribution of the diffused impurities, the variation of the re- 
sistivity of n- and p-type silicon with donor or acceptor density, and a 
fast electronic computer. The results of such a computation were first 
made generally available three years ago, in the form of curves relating
C_s to 1/(ρ_s x_j) for a given C_B, for n- and for p-type layers in silicon, and
for several common distributions.¹ Recent calculations, however, based
on new and more extensive silicon resistivity data, have indicated con- 
siderable error in the earlier results. Thus a comprehensive recomputa- 
tion has been undertaken, the outcome of which is presented herewith. 

A necessary adjunct to the calculation is an accurate knowledge of the 
resistivity of n- and p-type silicon with varying dopant concentration. 
To this end, most of the extant data have been reviewed and supplemented 
here and there with some new determinations. The results of this search 
are also presented here. 


II. THE RESISTIVITY OF SILICON AS A FUNCTION OF IMPURITY CONCEN- 
TRATION 


The variation of the resistivity of silicon at 300°K as a function of the
concentration of acceptors or donors is shown in Fig. 1. This graph
represents the author’s judgment of a most reasonable compromise to 
[Fig. 1 axis label: impurity concentration (cm^-3)]


Fig. 1 — Resistivity of silicon at 300°K as a function of acceptor or donor con- 
centration. 
TABLE I — RESISTIVITIES AND IMPURITY CONCENTRATIONS
IN SILICON (T = 300°K)

Resistivity      Impurity      Concentration
(ohm-cm)                       (cm^-3)

0.00076          B             1.66 × 10^20
0.00089          B             1.41 × 10^20
0.0010           B             1.49 × 10^20
0.0010           B             1.12 × 10^20
0.0012           B             1.04 × 10^20
0.0011           B             1.12 × 10^20
0.0014           B             9.23 × 10^19
0.0013           B             8.84 × 10^19
0.0067           B             1.43 × 10^19
0.0073           B             1.48 × 10^19
0.013            B             7.41 × 10^18
0.014            B             7.03 × 10^18
0.00095          As            1.80 × 10^20
0.00094          As            1.86 × 10^20
0.00094          As            1.1 × 10^20
0.00093          As            1.87 × 10^20
0.00094          As            1.97 × 10^20
0.00088          As            2.10 × 10^20
0.00088          As            2.19 × 10^20
0.00089          As            1.1 × 10^20
0.00083          As            2.30 × 10^20
0.00083          As            2.20 × 10^20
0.00080          As            2.46 × 10^20
0.00082          As            2.44 × 10^20


the mass of available and not altogether compatible data on the subject. 
These data include most of the previously published work (Refs. 3-12), 
recent, unpublished results kindly provided by other investigators,””” 
as well as some measurements obtained expressly for the present study. 

The last data are shown in Table I. The crystals involved were pulled 
from quartz crucibles, and hence can not be expected to be particularly
low in oxygen content. After dissolution of the boron-doped crystals 
and separation of the dopant,'* boron concentrations were determined by 
a photometric carmine technique essentially similar to published meth- 
ods.’” Arsenic concentrations were measured by gamma-ray spectrometry 
after pile neutron activation. Resistivity measurements were done with 
a four-point probe. In the case of a few samples, resistivity and carrier 
concentration were measured in Hall-effect apparatus (where it was 
assumed μ_H/μ = 1).

Drawing curves through these many points was accomplished by a 
succession of smoothing procedures, which were primarily visual. 75 per 
cent of the data points deviate less than 10 per cent from the curves thus 
obtained, both for the p-type and the n-type cases. The uncertainty is 
greatest in the degenerate region. For p-type silicon, suitable data become
scarce at dopings greater than 10” cm^-3, and none are available
beyond 3 × 10” cm^-3. For n-type material, there is an abundance of
rather conflicting data representing donor concentrations between 10”
cm^-3 and 6 × 10^20 cm^-3. In this region a 10 per cent variation in the
chosen line still includes 67 per cent of the data, however.

A single pair of curves obviously can not characterize with the same 
degree of accuracy all silicon material, regardless of dopant employed 
or degree of compensation. However, over the range 10“ cm^-3 < N ≤
10” cm^-3, and subject to the limitations discussed below, Fig. 1 is
considered to be within 10 per cent of reality. This graph refers specifically
to uncompensated silicon containing a donor or acceptor impurity
concentration, N, consisting of arsenic, phosphorus, or antimony for
n-type, and aluminum, boron, or gallium for p-type material. (Actually, 
even among samples doped with the aforementioned impurities, small 
but consistent differences in carrier concentration and mobility, depend- 
ing on the specific choice of donor or of acceptor, have been reported 
recently for silicon in the 0.001 ohm-cm region.’”’) In case of moderate 
compensation, the net impurity density, |N_A − N_D|, should be used
for N. However, heavy compensation requires allowance for the added
impurity scattering. 

For impurity densities near or greater than 10^20 cm^-3, Fig. 1 can not
be considered very reliable. At such concentrations, impurity band 
conduction is prominent and its effects are apt to differ appreciably 
depending on choice of impurity. Even more serious are the degrees of 
impurity precipitation and lattice imperfection which occur in highly 
doped material and which furthermore vary with growth conditions 
and history of the crystal. It will be noted with some consternation that 
the p-type and n-type curves are shown to cross near N = 3 × 10^20 cm^-3.
The paucity of data, of course, casts considerable doubt on this result. 
However, for what they are worth, such are the indications. Perhaps this 
can be understood in light of the acceptor action of imperfections, 
especially vacancies, which are abundant in very highly doped material. 

The calculations discussed in the remainder of this paper require a 
mathematical representation of Fig. 1. Straight-line approximations of 
the form (1/ρ) = B N^α have been obtained, which depart 10 per cent
from the desired curve at the turning points and rapidly approach
coincidence elsewhere. The parameters B and α are listed in Table II
for the respective straight-line regions. 
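
The straight-line fit lends itself to a short lookup routine. The Python sketch below evaluates (1/ρ) = B N^α region by region. The function and table names are illustrative and do not come from the paper. The region limits and B values are transcribed from Table II on the assumption that the printed exponents have been read correctly, so they should be checked against Fig. 1 before being relied on.

```python
import math

# Each entry: (lower limit, upper limit, B, alpha); limits in cm^-3.
# Values transcribed from Table II (assumed read correctly from print).
N_TYPE = [
    (0.0,     3.50e15,  2.00e-16, 1.000),
    (3.50e15, 1.00e17,  6.97e-14, 0.837),
    (1.00e17, 9.50e18,  6.93e-9,  0.543),
    (9.50e18, 6.00e19,  2.00e-16, 0.940),
    (6.00e19, 2.35e20,  1.43e-12, 0.744),
    (2.35e20, math.inf, 1.04e-6,  0.456),
]
P_TYPE = [
    (0.0,     1.50e16,  7.20e-17, 1.000),
    (1.50e16, 2.40e18,  3.30e-11, 0.650),
    (2.40e18, 1.50e19,  1.47e-14, 0.832),
    (1.50e19, math.inf, 4.00e-17, 0.966),
]


def conductivity(n, regions):
    """Return 1/rho in (ohm-cm)^-1 for impurity concentration n (cm^-3)."""
    if n <= 0:
        raise ValueError("concentration must be positive")
    for lower, upper, b, alpha in regions:
        if lower <= n <= upper:  # first matching region wins at a boundary
            return b * n ** alpha
    raise ValueError("concentration outside tabulated regions")


def resistivity(n, regions):
    """Return rho in ohm-cm for impurity concentration n (cm^-3)."""
    return 1.0 / conductivity(n, regions)


if __name__ == "__main__":
    for n in (1e15, 1e17, 1e19):
        print(n, resistivity(n, N_TYPE), resistivity(n, P_TYPE))
```

With these constants, adjacent segments agree to within about 6 per cent at every turning point, which is consistent with the stated 10 per cent departure from the smooth curve.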


III. DIFFUSION PROFILES AND CALCULATIONS 


The diffusion profiles of current practical interest are the complementary
error function, C_x = C_s erfc(x/2√(Dt)), and the Gaussian,
TABLE II — VALUES OF B AND α IN THE EQUATION (1/ρ) = B N^α,
REPRESENTING STRAIGHT-LINE APPROXIMATIONS TO THE ρ VS
N CURVES OF n-TYPE AND p-TYPE SILICON (T = 300°K)

Region (cm^-3)                            B                α

n-type
2.35 × 10^20 ≤ N_D                        1.04 × 10^-6     0.456
6.00 × 10^19 < N_D ≤ 2.35 × 10^20         1.43 × 10^-12    0.744
9.50 × 10^18 < N_D ≤ 6.00 × 10^19         2.00 × 10^-16    0.940
1.00 × 10^17 ≤ N_D ≤ 9.50 × 10^18         6.93 × 10^-9     0.543
3.50 × 10^15 ≤ N_D ≤ 1.00 × 10^17         6.97 × 10^-14    0.837
N_D ≤ 3.50 × 10^15                        2.00 × 10^-16    1.000

p-type
1.50 × 10^19 ≤ N_A                        4.00 × 10^-17    0.966
2.40 × 10^18 < N_A ≤ 1.50 × 10^19         1.47 × 10^-14    0.832
1.50 × 10^16 ≤ N_A ≤ 2.40 × 10^18         3.30 × 10^-11    0.650
N_A ≤ 1.50 × 10^16                        7.20 × 10^-17    1.000





C_x = C_s exp(−x^2/4Dt). In these expressions, x, D, and t are the depth,
diffusion coefficient (assumed independent of impurity density), and
time, respectively. C_x is the concentration of the diffused impurity at
depth x, and C_s that at the surface. The former distribution is expected
when diffusion takes place with the surface concentration C_s held
constant; the latter when the total impurity diffusing is constant.
Unfortunately, it must be admitted that the accuracy of these expectations is
open to question in some situations.”° Also, precipitation and compensation
of impurities near the surface may further distort the distribution.
However, it is still useful to solve the problem under these assumptions, 
leaving corrections for later determination. 
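
For either profile the junction depth follows from the condition that the diffused concentration has fallen to the background level. The short derivation below is a worked illustration rather than part of the original text; it assumes the junction lies where C_x = C_B and uses the symbols defined above.

```latex
\begin{aligned}
\text{Gaussian:}\quad
  & C_s \exp\!\left(-\frac{x_j^{2}}{4Dt}\right) = C_B
  \;\Longrightarrow\;
  x_j = 2\sqrt{Dt}\,\sqrt{\ln\frac{C_s}{C_B}}, \\[4pt]
\text{erfc:}\quad
  & C_s \operatorname{erfc}\!\left(\frac{x_j}{2\sqrt{Dt}}\right) = C_B
  \;\Longrightarrow\;
  x_j = 2\sqrt{Dt}\,\operatorname{erfc}^{-1}\!\left(\frac{C_B}{C_s}\right).
\end{aligned}
```

For example, a Gaussian layer with C_s/C_B = 10^4 gives x_j = 2√(Dt) √(ln 10^4), or about 6.07 √(Dt).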

The “average conductivity” of a diffused layer (which throughout
this paper is assumed to be diffused into a silicon slice of opposite
conductivity type and uniform doping C_B) is given by the expression

$$\bar{\sigma} = \frac{1}{\rho_s x_j} = \frac{1}{x_j} \int_0^{x_j} q \mu C \, dx$$


where q is the electronic charge, μ the carrier mobility typical of a total
ionized impurity density of C_x + C_B, C = r(C_x − C_B) is the density of
carriers, r being the fraction of uncompensated diffused impurity atoms
which are ionized, and C_x the total density of diffused impurity atoms at
depth x. (Possible variation of the mobility as a function of the proximity
of the surface is a hazard which should be recognized in passing but
is otherwise ignored in the present calculation.) Multiplying and dividing
within the integrand by r′(C_x + C_B), where r′ is the ionized fraction
associated with an uncompensated dopant density of (C_x + C_B), and
writing

$$q \mu r' (C_x + C_B) = \sigma(C_x + C_B) = B (C_x + C_B)^{\alpha},$$

the average conductivity becomes

$$\bar{\sigma} = \frac{1}{x_j} \int_0^{x_j} \frac{r}{r'} \, (C_x - C_B) \, B \, (C_x + C_B)^{\alpha - 1} \, dx.$$


Now (r/r′) represents the ratio of degrees of ionization corresponding
to C_x − C_B and C_x + C_B respectively. This ratio is very nearly unity
unless C_x and C_B are comparable in magnitude. Such is the case only for
the lamina nearest the junction, which contributes negligibly to the
conductance of the whole layer. Hence, (r/r′) may be justifiably taken
as equal to unity, and writing C_x = C_s f(x), where f(x) depends on the
profile of interest,

$$\bar{\sigma} = \frac{1}{x_j} \int_0^{x_j} \left[ C_s f(x) - C_B \right] B \left[ C_s f(x) + C_B \right]^{\alpha - 1} dx.$$


A program for the evaluation of this expression has been devised 
previously by others and employed in the analysis of diffused layers in 
germanium.” With slight additions to facilitate automatic plotting, the 
same program has been used in the present work. Computations were 
performed on an IBM 704, and plotting of points was carried out with 
an Electronic Associates Variplotter. 
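
The original program is not reproduced in the paper. As a rough modern stand-in, the Python sketch below evaluates the same integral numerically. It rests on several assumptions that are not stated in this form in the text: the junction is taken where C_s f(x) = C_B, the Table II region is chosen from the local total density C_x + C_B, depth is measured in units of 2√(Dt) (which cancels from the average), and a simple midpoint rule is used. All function names are hypothetical, and the Table II constants are those read from the printed table.

```python
import math

# (upper limit of region in cm^-3, B, alpha), transcribed from Table II.
N_TYPE = [(3.50e15, 2.00e-16, 1.000), (1.00e17, 6.97e-14, 0.837),
          (9.50e18, 6.93e-9, 0.543), (6.00e19, 2.00e-16, 0.940),
          (2.35e20, 1.43e-12, 0.744), (math.inf, 1.04e-6, 0.456)]
P_TYPE = [(1.50e16, 7.20e-17, 1.000), (2.40e18, 3.30e-11, 0.650),
          (1.50e19, 1.47e-14, 0.832), (math.inf, 4.00e-17, 0.966)]


def gaussian(u):
    """Gaussian profile f(u) = exp(-u^2), with u = x / (2 sqrt(D t))."""
    return math.exp(-u * u)


def power_law(total, regions):
    """Return (B, alpha) for the region containing the total density."""
    for upper, b, alpha in regions:
        if total <= upper:
            return b, alpha
    raise ValueError("concentration outside tabulated regions")


def junction_depth(profile, c_s, c_b):
    """Solve c_s * profile(u) = c_b for u by bisection (assumes c_s > c_b)."""
    if c_s <= c_b:
        raise ValueError("surface concentration must exceed background")
    lo, hi = 0.0, 1.0
    while c_s * profile(hi) > c_b:  # expand until the junction is bracketed
        hi *= 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if c_s * profile(mid) > c_b:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def average_conductivity(profile, c_s, c_b, regions, steps=20000):
    """Midpoint-rule estimate of the average conductivity in (ohm-cm)^-1."""
    u_j = junction_depth(profile, c_s, c_b)
    du = u_j / steps
    total = 0.0
    for i in range(steps):
        c_x = c_s * profile((i + 0.5) * du)
        b, alpha = power_law(c_x + c_b, regions)
        total += (c_x - c_b) * b * (c_x + c_b) ** (alpha - 1.0)
    return total * du / u_j


if __name__ == "__main__":
    # n-type layers with C_s = 1e19 cm^-3 diffused into C_B = 1e15 cm^-3.
    print(average_conductivity(math.erfc, 1e19, 1e15, N_TYPE))
    print(average_conductivity(gaussian, 1e19, 1e15, N_TYPE))
```

Passing `math.erfc` or `gaussian` as the profile selects the two distributions treated in the paper; any other monotonically decreasing f with f(0) = 1 could be substituted in the same way.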


IV. PRESENTATION OF RESULTS 


Of frequent interest in transistor design and in the analysis of diffused 
layers, are the characteristics of a ‘subsurface’ layer such as illustrated 
in Fig. 2. This layer, bounded on one side by the junction and on the 
other by a plane paralleling the junction at depth x, may be characterized 
by an average conductivity 


$$\bar{\sigma} = \frac{1}{\rho_s' (x_j - x)} = \frac{1}{x_j - x} \int_x^{x_j} q \mu C \, dx$$


where ρ_s′ is the sheet resistance of the subsurface layer. It will be
recognized that the base region of a diffused-base, alloyed-emitter transistor
is an example of a subsurface layer. Another example is that portion of
a diffused layer remaining after removing the top stratum of depth x.
Here, however, it must be remembered that the value of C_s specifying
this layer pertains to the original surface at x = 0.

Since a subsurface layer becomes the entire diffused layer when x = 0, 
it is convenient to display the properties of both in the same plot by 
introducing the parameter (x/x_j). On pages 394 to 410 such graphs are
presented for n- and p-type diffused layers of Gaussian and complementary
error function profile. Each graph contains the family of ten
curves (x/x_j) = 0, 0.1, ..., 0.9, and relates the average conductivity of


[Fig. 2 axes: impurity concentration (vertical, with C_s marked) versus depth (horizontal, with x and x_j marked).]


Fig. 2 — Profile of a diffused layer with subsurface layer shaded. 


each layer to the surface concentration (at the original surface) for a 
given value of C_B. A separate graph is required for each value of C_B,
which in the present work ranges from 10^14 cm^-3 to 10” cm^-3 at
one-decade intervals. In each plot the range of surface concentrations
spanned is from C_B to 10°" cm^-3. The so-called “Backenstoss” curve for
a particular C_B is simply the right-most line (x/x_j = 0) in each graph.

The wiggle in the n-type average conductivity for diffusant concentrations
near 10” cm^-3 is ascribable to the rather large change in slope
occurring in the n-type resistivity plot at N = 10” cm^-3.
Fig. 3 — Average conductivity of n-type complementary error function layers
in silicon. [Axis label: average conductivity, σ̄ = 1/[ρ_s′(x_j − x)], in (ohm-cm)^-1.]

Fig. 4 — Average conductivity of n-type Gaussian layers in silicon.

A Miniature Tuned Reed Selector of High
Sensitivity and Stability 


By L. G. BOSTWICK 
(Manuscript received August 23, 1961) 


This paper describes a selective contacting device that is responsive only
to sustained frequencies in a discrete narrow band and is insensitive to
speech and noise interference. It is of small size suitable for use in a
pocket-carried radio receiver and is sufficiently stable to permit 33 discrete
resonant frequencies, spaced 15 cycles apart, in less than an octave between
517.5 and 997.5 cycles per second. It has a threshold sensitivity of about 35
microwatts and other operating characteristics that are essential in large 
capacity systems. 


I. INTRODUCTION


Tuned reed selectors used as selective receivers in multifrequency 
systems involving large numbers of individual selections, such as personal
radio signaling,¹ must operate within close and specifiable limits
in order to avoid false signaling and to assure satisfactory performance 
under devious environmental and circuit conditions. In particular, three 
operating characteristics, or their equivalents, must be controlled, 
namely: the resonant frequency, the sensitivity (current or power needed 
at the most sensitive frequency), and the bandwidth (the frequency 
band in which contacting occurs with an input power twice that needed 
at the most sensitive frequency). 

The permissible variation in these characteristics is much smaller than 
would seem necessary from first considerations. Resonant frequency 
changes that seem negligible compared to the frequency spacing between 
adjacent selectors often become important when other system requirements
are considered simultaneously. For example, the frequency range
over which contacting will occur depends upon the electrical input level 
and the selector bandwidth. Consequently, feasible limits for both of 
these latter quantities must be considered, and in determining allowable 
frequency deviations from nominal, the lowest probable input level and 
the narrowest bandwidth must be taken into account. On the other
hand, excessively high input levels cannot be allowed even in those 
unusual instances where conserving power is unimportant, because this 
necessitates wider channel separations in order to avoid transient opera- 
tion of adjacent selectors, particularly those having high sensitivities. 
Furthermore, high input levels result in longer decay times, which often 
cannot be tolerated. When these and other related factors are considered 
and the widest manufacturing tolerances are sought, it is found that the 
above three selector characteristics are closely interrelated, and one 
cannot be relaxed without making one or both of the others more 
stringent. 

The tuned reed selectors described in this paper have factory adjust- 
ment provisions and sufficient structural stability to control in a practical 
manner the resonant frequencies, the sensitivities and the bandwidths 
within adequate and compatible limits. As a result, it is feasible to use 
33 discrete resonant frequencies, 15 cycles apart, in less than an octave 
between 517.5 and 997.5 cycles. An available electrical power of 35 
microwatts at each individual resonant frequency will just operate the 
contact, and a power of 100 microwatts will close the contact to a low 
resistance over 20 per cent or more of the reed period. These and other 
capabilities to be described distinguish these selectors from many others 
that are not adequate for reliable operation in large systems. 
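
The channel plan quoted above can be checked with a few lines of arithmetic. The Python sketch below simply enumerates 33 frequencies spaced 15 cycles apart starting at 517.5 cycles; the function name is illustrative only.

```python
def channel_frequencies(first=517.5, spacing=15.0, count=33):
    """Return the nominal resonant frequencies in cycles per second."""
    return [first + spacing * k for k in range(count)]


if __name__ == "__main__":
    freqs = channel_frequencies()
    # Lowest and highest channel, and their ratio (an octave would be 2).
    print(freqs[0], freqs[-1], freqs[-1] / freqs[0])
```

The highest channel comes out at 997.5 cycles, and the ratio of highest to lowest frequency is below 2, in agreement with the statement that the set spans less than an octave.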


II. GENERAL DESCRIPTION 


Fig. 1 is a photograph showing one complete reed selector with the 
outside shell removed. Fig. 2 is a partially exploded view showing 
the subassemblies and indicating how the parts are fitted together. 
The shell is formed from permalloy sheet; it serves as an effective shield 
from extraneous fields and as a high-permeability flux path for the 
internal magnetic circuit. All parts are electrically insulated from the 
shell. The complete selector weighs about 8 grams. 

As shown in these photographs, a tuning fork formed from two reeds 
brazed to a base block serves as the resonant element. This balanced 
type of structure does not require a massive support as would a single 
cantilever reed in order to isolate it from extraneous influences, an im- 
portant matter for a miniature device. This fork is freely supported 
within the shell by a compliant frame that further isolates any small 
residual vibration of the fork base from the rest of the selector, and yet 
is sufficiently stable to permit the vibrating contact on the end of the 
tuning-fork tine to be precisely positioned with respect to the stationary 
contact. This latter contact is carried by a loop of wire spot-welded to a 
rotatable stud that fits into a tapered hole in an insulating bushing in 
Fig. 1 — Tuned reed selector with shell removed.


the frame between the tines. A magnetic polepiece is positioned between 
the open ends of the tines, forming two equal gaps. Polarizing magnetic 
flux is set up in these gaps by a small permanent magnet attached to 
the opposite end of the polepiece. The energizing coil surrounds the 
center portion of the polepiece. 

The tuning fork is made of a nickel-iron-molybdenum alloy² (vibralloy)
Fig. 2 — Exploded view showing individual parts.


having controlled elastic and magnetic properties. Annealed permalloy 
with low coercive force and high permeability is used for the polepiece 
and shell to reduce magnetic flux changes. The materials and shapes of 
other parts are chosen to minimize dimensional changes with time and 
environmental conditions. 


III. FREQUENCY SELECTION AND FINE TUNING 


The range of resonant frequencies is obtained with tuning forks that 
have the same over-all length but varying free tine lengths. The small 
dimensions of these forks require the brazing fillets and the free reed 
lengths and thicknesses to be precisely controlled. By special attention 
to rolling of the reed stock, precise jigging of the reeds and base block, 
and brazing with minimum fillet dimensions, it is feasible to produce 
forks in which the individual tine frequencies are sufficiently close to 
chosen nominal frequencies spaced 15 cycles apart so that they may 
then be accurately tuned to these desired frequencies. 

Precise or fine tuning is accomplished with spring sliders that may be 
moved along the tines. This requires a slider that will stay in place 
under shock and vibration, will provide an adequate tuning range, and 
will allow the necessary fineness of frequency adjustment. This is 
achieved by means of small spring clips that snap on and ride along the 
edges of the tines. These sliders are shaped so that pressure at the center 
releases the force with which the slider seizes the reed and permits it 
to be moved. Each slider has a mass of about 1 milligram and provides 
a tuning range of about 10 cycles on forks near 500 cycles and of about 
25 cycles on forks near 1000 cycles. The sliders may be moved in incre- 
ments less than a thousandth of an inch, permitting the resonant fre- 
quencies to be readily set to a desired value within +0.05 cycle. The 
seizure forces are large so that shock and vibration acceleration in ex- 
cess of 1500 G are required to move the sliders. 
IV. CONTACT FACILITY AND SENSITIVITY ADJUSTMENT


The sensitivity is adjusted in manufacture by changing the contact 
gap separation. A fine rhodium wire having a resonance frequency above 
the frequency range of the tuning forks is supported by a loop of larger 
wire that may be rotated on a tapered stud through the frame. The 
fine wire is pretensioned with a prescribed force against the loop wire to 
form a lift-off type of contact that is accurately positioned and will 
follow large tine excursions without objectionable interference with the 
tine motion. This construction³ results in a contact that makes to a low
resistance with the vibrating contact on the reed for intervals of time 
that may be 25 per cent or more of the reed period, depending on the 
applied power. The operating sensitivity of the selector is precisely set by 
rotating the loop on the stud axis and thereby causing the end of the 
contact wire to move toward or away from the reed contact. The point 
of contact is close to the axis of rotation so that a fine control of the 
contact gap may be achieved. 


Bandwidth Control 


The bandwidth or sharpness of the resonance curve is determined pri- 
marily by three dissipative factors, namely: internal frictional losses in 
the reed material, viscous losses in the air surrounding the reeds, and 
eddy-current losses in electrically conducting parts. The last factor has 
been chosen as the adjustment or control means for bandwidth. A 
copper washer is placed around the polepiece where flux changes
due to motion of the reeds induce eddy currents in the copper. By 
selecting the proper washer thickness and diameter and by setting the 
magnet strength to yield the proper flux density, eddy currents are 
developed when the tines vibrate that absorb energy and reflect into the 
system as an effective mechanical resistance that broadens the resonance 
curve by the desired amount. 


V. VIBRATING SYSTEM PARAMETERS 


Tabulated in Table I are some measured and derived data that show 
the magnitudes of the more important vibrating system constants of 
two selector samples with resonant frequencies nearly an octave apart. 
These are typical values that will be of interest to those concerned with 
the vibrational mechanics, electromechanical coupling, and other ana- 
lytical design factors. 
                                            Nominal Frequency         Nominal Frequency
                                            517.5 cps                 997.5 cps

Reed dimensions — length                    1.4 cm                    1.01 cm
                  thickness                 0.015 cm                  0.015 cm
                  width                     0.254 cm                  0.254 cm
Effective reed stiffness                    1.45 × 10^5 dynes/cm      3.88 × 10^5 dynes/cm
Resonant frequency as brazed                560 cps                   1068 cps
Resonant frequency with contact             530 cps                   1011 cps
Resonant frequency with slider as tuned     517.5 cps                 997.5 cps
Effective reed mass as brazed               0.0118 grams              0.0087 grams
Effective reed mass with contact            0.0130 grams              0.0096 grams
Effective reed mass with slider as tuned    0.0138 grams              0.0099 grams
Electrical impedance at resonant
  frequency                                 478 + j231                448 + j430
Electrical blocked impedance at
  same frequency                            220 + j277                235 + j485
Electrical motional impedance at
  same frequency                            258 − j46                 213 − j55
Current to just close contact               0.275 milliamps           0.275 milliamps
Bandwidth                                   1.1 cycles                1.3 cycles
Effective mechanical resistance
  of fork at resonance                      0.19 mechanical ohms      0.16 mechanical ohms
Electromechanical coupling factor           2.24 × 10^5 ∠5°           1.88 × 10^5 ∠7.2°
                                            dynes/abamp               dynes/abamp
Effective magnetic gap stiffness
  (each gap, from frequency
  shift measurements)                       −0.02 × 10^5 dynes/cm     −0.02 × 10^5 dynes/cm
Corresponding gap flux density              200 gauss                 200 gauss
Maximum tine flux density (assuming
  fringe flux equal to gap flux)            4000 gauss                4000 gauss
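
As a consistency check on the tabulated constants, the effective stiffness k and effective mass m can be combined through the simple harmonic oscillator relation f = (1/2π)√(k/m). The paper does not say that the masses were derived in this way, so the relation is applied here as an assumption; the Python sketch below, with illustrative names, compares the computed and tabulated frequencies.

```python
import math


def resonant_frequency(stiffness_dyn_per_cm, mass_g):
    """Resonant frequency in cps of a mass-spring system in CGS units."""
    return math.sqrt(stiffness_dyn_per_cm / mass_g) / (2.0 * math.pi)


# (stiffness in dynes/cm, effective mass in grams, tabulated frequency in cps)
TABLE_I_CASES = [
    (1.45e5, 0.0118, 560.0),   # lower-frequency unit, as brazed
    (1.45e5, 0.0130, 530.0),   # with contact
    (1.45e5, 0.0138, 517.5),   # with slider as tuned
    (3.88e5, 0.0087, 1068.0),  # upper-frequency unit, as brazed
    (3.88e5, 0.0096, 1011.0),  # with contact
    (3.88e5, 0.0099, 997.5),   # with slider as tuned
]

if __name__ == "__main__":
    for k, m, tabulated in TABLE_I_CASES:
        computed = resonant_frequency(k, m)
        print(f"tabulated {tabulated:7.1f} cps   computed {computed:7.1f} cps")
```

Under this assumption every computed frequency falls within 1 per cent of the tabulated value, which suggests the stiffness and mass entries are mutually consistent.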





VI. PERFORMANCE OBJECTIVES 


Consideration of the over-all system operating requirements for
personal radio signaling pertaining to such factors as the needed number
of individual selections, practical radio receiver power levels, calling 
rates, and environmental conditions, led to the following objectives for 
the performance of the reed selectors: 

1) Nominal frequency range — 517.5 to 997.5 cycles. 

2) Nominal frequency separation — 15 cycles. 

3) Frequency deviation limits — ±0.3 cycle, including adjustment
tolerances, aging, shock, magnetic changes, and all other instabilities 
except those due to temperature changes. 

4) Temperature-frequency deviation limits — ±0.2 cycle over temperature
range of 35°F to 110°F (2°C to 43°C).

5) Nominal bandwidth — 1.0 cycle. 
6) Bandwidth deviation limits — 0.8 to 1.4 cycles resulting from
temperature changes and all other causes. 

7) Nominal current to just operate contact — 0.25 milliamps for a 
nominal 500-ohm coil impedance at resonance. 

8) Just-operate current deviation limits — +3.0 db resulting from 
temperature changes and all other causes. 

These objectives are mutually consistent in that the limits given in 
each case are as large as can be tolerated without reducing the limits on 
some other factor. There are other important design considerations that 
must not be neglected, such as weight, size and shape, contact life, 
shock tolerances, corrosion resistance, magnetic interaction and so 
forth, and with respect to which the selectors must, of course, be ade- 
quate. However, the above-tabulated characteristics are the most sig- 
nificant from an operating standpoint and are sufficient under marginal 
conditions to assure positive operation and avoid false signaling. 


VII. TYPICAL MEASURED DATA 


Presented below are measured data showing that the above-described 
reed selector meets these objectives. By means of the spring sliders, the 
two tine frequencies are made alike within a small fraction of a cycle 
and are given values that result in a combined fork frequency well within 
requirements. Attention is given in the assembly and adjustment pro- 
cedure to magnetically and mechanically stabilize the whole structure.
The magnet is stabilized well below its maximum remanence; the whole 
final assembly is subjected to a moderately high temperature to relieve 
residual stresses; and the tines are vibrated at a suitable level to bring 
them into a normalized magnetic state prior to final adjustment. The 
resulting selectors have resonant frequencies that will remain within 
+0.3 cycle from their nominal frequencies at normal room temperatures 
and under reasonable conditions of mechanical shock and electrical over- 
load. Negligible changes occur under shocks up to 1500 G (2 milli- 
seconds duration) or with input levels 20 db above the just-operate 
values. 

Frequency stability with temperature is achieved by making the
forks of a nickel-iron-molybdenum alloy of such a composition that 
magnetic permeability changes are small and the temperature coefficient 
of Young’s modulus is low and of a magnitude to compensate for di- 
mensional changes with temperature. Operate current stability is real- 
ized by additional attention to the design geometry and materials so 
that changes in temperature cause variations in contact separation that
are a small fraction of a mil-inch. 
Fig. 3 — Variation with temperature in the operating characteristics of a
typical lower-frequency tuned reed selector. 


Fig. 3 and Fig. 4 are graphs of measured data showing variations with
temperature in the resonance frequency, just-operate current and band- 
width of two typical samples, one at each end of the nominal frequency 
range. The range covered by these graphs is much wider than that 
required for most applications. In the more common temperature range 
of 35° to 110°F, the deviations are well within the limits tabulated above.

Fig. 5 and Fig. 6 are electrical impedance diagrams of the same two
selector samples with resistance and reactance as coordinates and fre- 
quency as the variable parameter. This form of plot emphasizes the 
interesting values near resonance and may be used for analytical
purposes.⁴ From these graphs, it can be determined that the conversion of
electrical to effective mechanical power is about 46 per cent and that 
the available electric power necessary to just operate the contact is 
about 33 microwatts. 
Fig. 4 — Variation with temperature in the operating characteristics of a
typical upper-frequency tuned reed selector. 


VIII. NOMINAL OPERATING LEVELS AND TIMES


The electrical power source supplying selectors in a system must have 
an available power capacity sufficient to cause dependable contacting 
under the worst temperature and adjustment conditions. These worst 
conditions obtain when the frequency deviation from nominal and the 
just-operate current are at their maximum values. Considering the 
limits permitted in these selectors and making allowance for contact 
quality and life with some statistical advantage taken of the small 
chance of all limiting conditions occurring simultaneously, it was
determined that the minimum electrical input power should be 6 db above
that needed to barely close the contact of a nominal selector. At this 
level, the time required to close the contact after energizing the coil is 
equal to the time needed for the reed amplitude to decay below contact- 
Fig. 5 — Vector impedance diagram of a typical lower-frequency unit.


ing amplitude after the coil current is stopped. For nominal selector 
constants, this time is approximately 225 milliseconds. Input levels
higher than 6 db above just-operate will result in faster operating times 
and slower decay times, but the sum of the operate and decay times will 
increase less than 20 per cent up to input levels 12 db above the nominal 
just-operate value. 
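
The decibel margins quoted in this section translate into power and current ratios as sketched below in Python. The sketch assumes the usual definitions (10 log10 of a power ratio and, for a fixed impedance, 20 log10 of a current ratio) and uses the nominal just-operate current of 0.25 milliamps from the performance objectives; the function names are illustrative.

```python
def power_ratio(db):
    """Power ratio corresponding to a level difference in decibels."""
    return 10.0 ** (db / 10.0)


def current_ratio(db):
    """Current ratio for the same level difference at fixed impedance."""
    return 10.0 ** (db / 20.0)


if __name__ == "__main__":
    nominal_just_operate_ma = 0.25  # objective 7, nominal selector
    for margin_db in (6.0, 12.0):
        drive_ma = nominal_just_operate_ma * current_ratio(margin_db)
        print(f"{margin_db:4.1f} db: power x{power_ratio(margin_db):.2f}, "
              f"current {drive_ma:.3f} mA")
```

A 6-db margin therefore corresponds to roughly four times the just-operate power, or about twice the just-operate current.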


IX. CONTACT CAPACITY AND LIFE 


The contact has greater capability than would at first seem likely.
Such a light contact is most frequently used in circuits to change the
potential on a tube or transistor and thereby trigger some desired
signaling or switching function without the contact current exceeding a few
milliamperes. The contact closure is intermittent at a rate corresponding
to the frequency of the selector, and the duration of the individual
closures is a small fraction of a millisecond, depending upon the
frequency and input level. These short closures, however, occurring at a
rate of several hundred times per second, may control current pulses that
have an integrated or averaged power that is a substantial fraction of a
watt.

Fig. 6 — Vector impedance diagram of a typical upper-frequency unit.

The maximum power that can be controlled depends mostly upon the 
reactive elements in the contact circuit and the life needed from the 
selector. As an example of what may be expected, Fig. 7 shows changes 
that occurred in the resonance frequency and the sensitivity of a typical 
selector when operated continuously (except for a few minutes about 
every 100 hours during check test) over a period of 1500 hours. The 


422 THE BELL SYSTEM TECHNICAL JOURNAL, MARCH 1962 











[Fig. 7 plot: resonant frequency (in cycles per second) and sensitivity (operate current at most sensitive frequency, in ma) versus duration of test in hours, 0 to 1600.]


Fig. 7 — Variation with time in the sensitivity and frequency of a selector 
closing a 12-volt battery through a 240-ohm resistor. 


electrical input was 9 db above the just-operate value, and the contact 
closed a 12-volt battery through a 240-ohm resistor, giving a closure 
current of 50 milliamperes. Throughout the test period the resonance 
frequency changed only slightly and the just-operate current increased 
about 20 per cent. This latter change was due to erosion of the contact
wire, which increased the contact gap. Erosion was minimized by con- 
necting the fine contact wire to the negative side of the battery. At the 
end of the test, the diameter of the contact wire was approximately half 
its original value. 
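
The closure current quoted for the life test follows directly from Ohm's law. A minimal check is sketched below; the variable names are chosen here for illustration only.

```python
battery_volts = 12.0   # battery voltage used in the life test
load_ohms = 240.0      # series resistor closed by the contact

closure_current_ma = battery_volts / load_ohms * 1000.0
power_during_closure_w = battery_volts ** 2 / load_ohms

print(f"closure current: {closure_current_ma:.0f} ma")
print(f"power in the resistor while the contact is closed: {power_during_closure_w:.2f} w")
```

The averaged power is lower than this instantaneous value by the fraction of each cycle during which the contact is closed, a fraction the text does not state for this test.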


X. APPLICATIONS 


The manner in which these selectors are used in the circuits of the 
BELLBOY Personal Radio Signaling system will be described in a paper 
to be published on the pocket radio receiver. In this system, three tuned 
reed selectors are operated simultaneously in the receiver, and these 
trigger a transistor oscillator that gives an audible signal. The power 
controlled by the contacts in this case is small. 

The substantial power capacity of the contacts can be used to operate 
relays and other devices directly. Pulses of current from a battery at 
the selector frequency can be supplied to a smoothing or integrating 
capacitor, and the relatively constant voltage across the capacitor can 
be used to operate a sensitive dc relay. The battery may be at the
location of the reed selector or may be supplied by superposition over the
same circuit used to transmit the selector frequency.

Fig. 8 — Reed selector actuated mercury relay for selective control of multiple
functions requiring substantial powers.

The contact may also be used as a synchronous rectifying means to 
generate dc from the same ac source that operates the selector, as shown
in Fig. 8. When the source frequency corresponds to that of the reed 
selector, the contact of the selector closes in synchronism once each cycle 
to send unidirectional pulses to the capacitor and relay in parallel. The 
capacitor smoothes the pulses and gives a nearly constant current in the 
relay winding. For maximum sensitivity it is desirable that the contact 
closures occur near the peaks of the supply voltage wave, and this is 
accomplished by connecting a large reactance (either inductive or 
capacitative) in series with the selector winding. This reactance also 
serves to attenuate the supply voltage applied to the selector winding 
to avoid overdriving the reeds, because a supply voltage large enough to 
operate a relay is ordinarily many times that needed to operate the reed 
selector. Combination circuits using reed selectors and mercury-wetted 
contact relays provide a simple means of selectively controlling sub- 
stantial powers to perform a multiplicity of functions over a single pair 
of wires. 

When operated just below the contacting level, these selectors have a
Q (resonant frequency-to-bandwidth ratio) in the range of 500 to 1000
and therefore may be used effectively in a selective bridge or filter circuit 
as described in a previous paper.® The use of such a selective circuit in 
the feedback loop of a single transistor oscillator results in an attractively 
simple source of frequency having a precision corresponding to that of 
the selector.


An X-Ray Diffraction Study of the
Structure of Guanidinium Aluminum
Sulfate Hexahydrate


By S. GELLER and H. KATZ†
(Manuscript received March 21, 1961) 


The Busing-Levy IBM 704 least squares program has been applied to 
three-dimensional X-ray diffraction data from crystals of guanidinium
aluminum sulfate hexahydrate taken with the Bond-Benedict single-crystal 
automatic diffractometer. Indications of interactions between parameters 
were evident in the early stages of refinement and were not removed in the 
subsequent cycles. Strong interactions were subsequently corroborated by 
large values of many of the correlation coefficients of pairs of parameters. 
In this case these interactions prevent refinement. The correctness of the 
general features of the structure as given in a previous paper on the gallium 
isomorph is nevertheless corroborated by the present investigation.

To enable those who have had similar difficulties to compare results, a 
fairly detailed account is given of the course of the attempt to refine the 
structure. The effects of highly correlated parameters are emphasized. 


I. INTRODUCTION 


The purposes of the investigation to be described were manifold. An 
approximate structure of the isomorphous gallium compound has al- 
ready been reported.’ The gallium compound with the heaviest metal 
atom among the isomorphs appeared to be best for establishing the 
general features of the structure.” However, in the hope of finding a 
closer relation between the structure and its electrical properties, it 
appeared that a refinement of the structure would be very worthwhile. 
In such a case, one would wish to have all of the atoms of more nearly 
the same scattering power; thus the guanidinium aluminum sulfate 
hexahydrate (G.A.S.H.) compound seemed most suitable for this purpose.
Furthermore, this crystal would have the lowest linear absorption
coefficient for all practical radiations; the importance of this feature will
be discussed later. But probably most important, it was anticipated
that the aluminum compound would be the one on which most
measurements of various sorts would be made. This has indeed been the case.

† The contribution of H. Katz to this work was made during a period of
employment at Bell Telephone Laboratories in the summer of 1959.

While our earlier paper’ was in press, a note’ appeared in Kristallo- 
grafia which gave an approximate structure for G.A.S.H. and its iso- 
morphs which differed from that reported by us. A check with our data 
indicated that the structure reported by Varfolomeeva et al.’ was incor- 
rect,’ but this did not mean that the structure reported by us was neces- 
sarily correct. We had to face the question as to whether the correct 
structure might lie between the two structures or, as mentioned in our
first paper, perhaps some subtle disorder existed in the structure. In any 
case the appearance of the other result gave additional impetus to 
completion of work that had been started several years ago. 

There is a further importance of this work. The quantitative X-ray 
data were taken with the Bond-Benedict single-crystal automatic dif- 
fractometer.* It is the only crystal so far studied with this equipment 
and perhaps is the first X-ray structure analysis to be based on three- 
dimensional data collected automatically. Thus at least a small part of 
this paper will be devoted to an assessment of this equipment and sug- 
gestions as to future plans. 

Perhaps the most frustrating experience encountered is to find inde- 
terminate a problem which has taken considerable expenditure of time 
and effort of various sorts. One such reported problem in the field of 
X-ray crystallography is that of the determination of the structure of 
tetragonal BaTiO₃; this problem was found by Evans’ to be indeterminate
by X-ray analysis, at the very least on the basis of the data
collected. The results of the work on the three-dimensional data of 
G.A.S.H. indicate that the structure as originally reported by us is 
essentially correct. But we find that although a low discrepancy factor 
and standard error of fit are obtained by the least squares method of 
refinement, the structure cannot be refined; that is, convergence is not 
attained: there are parameter oscillations in each least squares itera- 
tion; some improbable interatomic distances and large error estimates 
are obtained. The cause appears to be strong interdependence of many 
of the parameters. 
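
Interdependence of this kind can be quantified through the correlation coefficients of the least squares parameters. The sketch below is not the Busing-Levy code. It assumes a small made-up normal-equations matrix and shows only the standard step of turning its inverse into correlation coefficients, ρ(i,j) = c(i,j) / [c(i,i) c(j,j)]^½. The helper names and the numbers are hypothetical.

```python
import math

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(matrix)
    aug = [list(map(float, row)) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def correlation_matrix(normal_matrix):
    """Correlation coefficients from the inverse of the normal-equations matrix."""
    cov = invert(normal_matrix)  # proportional to the covariance matrix
    n = len(cov)
    return [[cov[i][j] / math.sqrt(cov[i][i] * cov[j][j]) for j in range(n)]
            for i in range(n)]

# Hypothetical normal matrix in which parameters 0 and 1 are nearly redundant.
A = [[1.00, 0.98, 0.10],
     [0.98, 1.00, 0.10],
     [0.10, 0.10, 1.00]]

rho = correlation_matrix(A)
for row in rho:
    print(" ".join(f"{v:7.3f}" for v in row))
```

A coefficient close to ±1 between two parameters, as between parameters 0 and 1 in this toy matrix, means the data can hardly distinguish a shift in one from a compensating shift in the other.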

In this investigation the correlation matrix is used to demonstrate the 
existence of the strongly interacting parameters. The importance of this 
approach has also been demonstrated by a recent investigation described
in a paper written by one of us (S.G.).°

TABLE I — LATTICE CONSTANTS OF GUANIDINIUM ALUMINUM
SULFATE HEXAHYDRATE

Investigators       a, Å                c, Å
Wood                11.77 ± 0.04        8.98 ± 0.03
Ezhkova et al.      11.737 ± 0.002      8.948 ± 0.002
This work           11.75 ± 0.02        8.94 ± 0.01





II. CRYSTAL DATA 


Guanidinium aluminum sulfate hexahydrate, C(NH₂)₃Al(SO₄)₂·6H₂O,
is isostructural with the previously reported’ gallium compound. The 
morphology and unit cell dimensions have been reported by Wood.’ 
Lattice constants have also been reported by Ezhkova et al.* The central 
values of our lattice constants, obtained from careful measurement of 
Buerger precession camera photographs, differ from those reported in 
both of the aforementioned papers, but are in better agreement with 
those of Ezhkova et al.* For purposes of comparison, the variously re- 
ported values are listed in Table I. 

As described earlier,’ the most probable space group to which the 
crystal belongs is P31m and the unit cell contains three formula units. 
The molecular weight of the Al compound is 387.29, the volume of the 
unit cell is 1,069 Å³, and the X-ray density is 1.804 g/cc.
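
The cell volume and X-ray density can be checked from the lattice constants of this work. The sketch below assumes a hexagonal cell, V = (√3/2)a²c, three formula units per cell, and a modern value of Avogadro's number, which may differ slightly from the constant used in the original calculation.

```python
import math

AVOGADRO = 6.02214076e23  # mol^-1, modern value (assumed here)

def hexagonal_cell_volume(a: float, c: float) -> float:
    """Volume in cubic angstroms of a hexagonal cell with edges a and c in angstroms."""
    return math.sqrt(3.0) / 2.0 * a * a * c

def xray_density(mol_weight: float, z: int, volume_a3: float) -> float:
    """Density in g/cc from formula weight, formula units per cell and cell volume."""
    return z * mol_weight / (AVOGADRO * volume_a3 * 1.0e-24)

volume = hexagonal_cell_volume(11.75, 8.94)   # lattice constants of this work
density = xray_density(387.29, 3, volume)

print(f"cell volume: {volume:.0f} cubic angstroms")
print(f"X-ray density: {density:.3f} g/cc")
```

With these inputs the volume comes out close to the quoted 1,069 Å³ and the density close to the quoted 1.804 g/cc; small differences in the last digit depend on the constants used.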


III. DETERMINATION OF THE STRUCTURE


The determination of the structure has been described in the paper 
on the gallium compound. The evidence for the correctness of the general 
features of the structure described in that paper, including the orienta- 
tion of the guanidinium ions, is conclusive as will be shown subsequently. 


IV. EXPERIMENTAL 


The Bond-Benedict single-crystal automatic diffractometer* was used
to collect the three-dimensional data. Some changes from the original 
design of the instrument and in the electronics were made before the 
final data were taken. A detailed description of these changes must be 
left to the original authors. However, it should be mentioned that for 
these particular data (which were taken in 1956), a proportional counter 
replaced the Geiger counter and the “back-set” correction* was virtually
eliminated by circuitry changes. Also, the internal geometry of the
collimator was changed to square cross section.

† Dr. E. A. Wood and Mrs. V. B. Compton have informed us that their recent
measurements of lattice constants of G.A.S.H. give values which agree more closely
with those of Ezhkova et al.8 and of the present work.

The need for a collimator with square cross section derived from the 
mechanics of the instrument. The “back-setter” produces a jarring of
the goniometer head which could at times translate the crystal very 
slightly out of the original alignment in the X-ray beam. If the beam 
has a circular cross section, slight deviation from coincidence of crystal 
cylinder and rotation axes causes significant differences in intensity 
when the diameter of the crystal is large relative to the beam cross 
section. This is not true of a beam with a more or less square cross 
section. 

Of course, one would not have to worry too much about this if small 
crystals were being used. However, for this instrument and the use of 
the usual type of sealed X-ray tube, it is necessary to use large crystals 
to obtain the data. (This will be discussed further later.)

Two cylindrical crystals were used to obtain the data attainable by
this instrument with CuKα radiation and a pentaerythritol
monochromator. The crystal aligned along the c-axis had a diameter of 0.67
mm; the crystal aligned along the [20-1] direction (orthohexagonal
A-axis) had a diameter of 0.54 mm. With a linear absorption coefficient
for CuKα radiation of 48.7 cm⁻¹, the values of μR for these crystals are
1.64 and 1.82 respectively.

As described in the paper by Bond,’ the single-crystal automatic 
diffractometer works on a principle similar to that of the equi-inclination 
Weissenberg camera. With CuKα radiation, seven levels were obtainable
about the c-axis and fifteen about the orthohexagonal A-axis.

Data from a particular level n were collected as follows: The align- 
ment of the crystal was checked. This was done in two ways whenever 
possible. A microscope could be used to align the crystal cylinder axis 
with the rotation axis of the instrument. The equi-inclination angle was 
calculated and the crystal set to this angle. The arrangement of the 
counter of the instrument is always set so that the diffracted beam is 
incident perpendicularly to the window. Thus the counter is actually 
moved to twice the angle of the crystal from the zero level situation. 
If a particular reflection (for example, 00·l on the lth level about the
c-axis) was observable when the counter angle was equal to zero
degrees for a given layer, this reflection was used to readjust crystal and
counter. 
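
The equi-inclination setting for level n about an axis of repeat distance t follows from the standard Weissenberg relation sin μ = nλ/2t. The sketch below assumes a mean CuKα wavelength of 1.5418 Å and the c-axis repeat of 8.94 Å from Table I. It does not model any mechanical limits specific to the Bond-Benedict instrument, and the function name is hypothetical.

```python
import math

CU_K_ALPHA = 1.5418  # angstroms, assumed mean CuK-alpha wavelength

def equi_inclination_angle(level: int, repeat: float,
                           wavelength: float = CU_K_ALPHA) -> float:
    """Equi-inclination angle in degrees for a given level about an axis."""
    s = level * wavelength / (2.0 * repeat)
    if abs(s) > 1.0:
        raise ValueError("level is outside the limiting sphere at this wavelength")
    return math.degrees(math.asin(s))

# Levels l = 0 to 6 about the c-axis (c = 8.94 angstroms).
for n in range(7):
    print(f"level {n}: mu = {equi_inclination_angle(n, 8.94):5.2f} degrees")
```

The counter is then swung through twice this angle from its zero-level position, as described above.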

To obtain the weak intensities, the diffraction unit settings were usu- 
ally 40 kv and 20 ma. To obtain the stronger reflections, proper settings 
of the voltage and tube current were made so as to record enough 
moderate reflections to establish a scale between the two patterns. 


X-RAY STUDY OF G.A.S.H. 429 


Integrated intensities, crystal angles and counter angles for each level 
were recorded automatically by the Leeds-Northrup two-pen recorder 
as described in the papers by Bond and Benedict.’ As indicated above, 
resetting was made manually for each new level. 

Following the collection of the data by the recorder, it was necessary 
to index the data: This was the most time-consuming (i.e., on a man- 
hour basis) part of the data processing required to obtain the observed 
amplitudes. The indexing was carried out with the use of the plotting 
device.* (The indexing problem will be discussed further later.)

Following the indexing of all the data, the usual absorption,
Lorentz-polarization and Tunell’ rotation factors† were applied to extract the
relative |F_o|². (The polarization correction is for monochromatized
radiation.) The calculation was programmed for the IBM 704 by R. G.
Treuting. The corrections calculated were based on the formulae‡
given by Bond and the tables used for the absorption corrections are
those given in Bond’s paper.” The program written by Treuting put the
resultant |F_o|²’s or |F_o|’s out on cards as well as on a print-out. The
individual Lorentz-polarization, absorption and Tunell rotation factors
were also printed out for each reflection for each layer on which it
appeared.

Having extracted the |F_o|²’s for each layer about each of the two
axes, the next step involved an iterative cross-calibration process to
bring the values to the same basis. An IBM 704 program written by
W. R. Romanow allowed us to apply constant factors to the sets of
|F_o|² put out by the intensity correction program. Romanow’s program
also put out new cards so that we could apply a different constant to the
new values if necessary.

When we felt we had arrived at the best values, it was decided to 
carry out the subsequent least squares refinement on the basis of the 
|F_o| values. Using a short program written by Romanow, square roots
were taken of all the |F_o|²’s and put out on cards. Those that came from
layers about the orthohexagonal A-axis were then sorted on the values
of l for ease in setting up the data for the least squares refinement.

As described in the Bond-Benedict papers, some reflections do not
get entirely into the counter; thus, in order to be sure that all are
obtained, the instrument was designed to obtain each reflection twice.
For this reason the counter has a 4° window. Even at that, not all the
reflections of a given form will have the same intensity, but usually
about a twofold axis, a form of reflections of moderate intensity will
have two with the same intensity. About a threefold axis, perhaps eight
of twelve reflections from a given hk·l form will have the same intensity
or 12 out of 16 of a given hk·0 form. Unfortunately, the weaker
reflections do not give as good results as the moderate to strong ones. In the
case of the c-axis layers, if there was a variation in the height of peaks
which appeared to have been fully in the window, the value taken for
the integrated intensity was the average of the several peaks. In the
case of the orthohexagonal A-axis layers, because there were fewer
peaks contributing to a form and therefore a greater possibility that
only one peak was squarely in the window, the value recorded in most
cases was the measure of the highest peak.

† The proportional counter employed had a linear response to counting rates of
over 20,000 cps. Because for even the strongest reflections, observed counting rates
over 10,000 cps gave integrated intensities which went off scale on the recorder, no
dead-time correction* was necessary for any of the reflections.

‡ The formula for P; on p. 380 of Bond’s paper should read


. gq — sin? » 2q | 
— 4 1 a ee eee 2 jos eS. 1 : 
P, = T sin 2» || + (= Fe hot ) (1 + cos 26) i 7. + cos ayy 

In taking the averages of observed structure amplitudes, the weighting
was in accord with the above. For example, for a given |F_hk·l|, h,k,l ≠ 0,
the value from the c-axis layer was weighted four times and a value from
an orthohexagonal A-axis layer once. The standard deviation was
calculated in accordance with the analysis given in Chapter 16 of the book
by Dixon and Massey” and as suggested earlier by Ibers.” However,
for the unobserved, the standard deviation was taken as equal to half
the minimum observable. For |F_hk·l|’s which would have unity weight
since they appear only once, the σ was taken in accordance with a
subjective estimate comparing the particular |F_hk·l| with others of
similar value. The agreement between or among |F_o|’s from the same
form but from different layers was quite good generally except for the
weakest reflections.
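
The 4:1 weighting described above amounts to an ordinary weighted mean. A minimal sketch follows; the amplitudes are made-up numbers for illustration, not data from this study, and the function names are hypothetical.

```python
def weighted_mean(values, weights):
    """Weighted mean of several measurements of the same structure amplitude."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)

def sigma_unobserved(minimum_observable):
    """Standard deviation assigned to an unobserved reflection."""
    return 0.5 * minimum_observable

f_c_axis = 20.0  # hypothetical |F_o| from a c-axis layer, weight 4
f_a_axis = 25.0  # hypothetical |F_o| from an orthohexagonal A-axis layer, weight 1

f_mean = weighted_mean([f_c_axis, f_a_axis], [4.0, 1.0])
print(f"weighted mean |F_o|: {f_mean:.1f}")
print(f"sigma for an unobserved reflection with threshold 3.0: {sigma_unobserved(3.0):.1f}")
```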

In the CuKα sphere, there is a total of 895 X-ray forms of guanidinium
aluminum sulfate hexahydrate. The geometry of the Bond diffractometer
allows us to observe only 842 of these. Of those possibly
observable by the instrument, only 546 were actually observed.


V. ATTEMPT TO REFINE THE STRUCTURE


Because the major point of this paper is to demonstrate that the 
refined structure under discussion is effectively unattainable from the 
X-ray diffraction data, it seems worthwhile to give some of the details 
of the calculations. To make such a discussion simpler, the pertinent 
data are collected in tables. In Tables II and IV the values of parameters 
and some other important information are listed. In Table II two
columns are assigned to each cycle; the left one lists the starting parameters,
the right, the calculated “corrected” parameters. A blank space in the
left column indicates that the last previous calculated value was the 
starting value for the particular parameter. In the cases of cycles 9 and 
10, all of the parameters had the last previous calculated values of 
cycles 8 and 9 respectively. 

The order in which the atoms are listed in Tables II and IV is not 
the same as that of the paper’ on the gallium isomorph, but the atom 
labeling is. In writing the special position symmetry patch for the 
Busing-Levy” IBM 704 least squares refinement program, it is most 
convenient to list the atoms in general positions first. Then to avoid 
mistakes in the listing of results, it is best to leave the order the same 
as that of the output of the program. 

In the calculation of structure amplitudes the following atomic
scattering factors were used: for O, Al³⁺, N and C, those of Berghuis et al.;”
and for S, those of Viervoll and Ogrim.”

In cycles 1 and 2, 895 reflections, all those representing independent
forms and observable in the CuKα sphere, but perhaps not observable
with the single-crystal diffractometer, were included. Eight of the
parameters were scale factors, all of which were initially equal to 0.6667,
one for each value of l from 0 to 6 and the eighth value for all the
remaining l values. Also in the first two cycles, isotropic temperature
factors were used despite the fact that it was obvious that the thermal 
motions of the atoms in this crystal must be highly anisotropic. 

The starting structural parameters for the first cycle were those given 
for the gallium isomorph’ except for changes in the S and Al tempera- 
ture factors and the y-parameter of N(II), which was inadvertently 
taken as 0.418 instead of 0.333. Now it may be seen in Table II under 
cycle 1, that this y-parameter did not change as radically as one might 
have hoped, in fact as one might have expected, for an incorrect parame- 
ter. But the temperature factor of the atom did increase considerably, 
perhaps indicating that the atoms did not want to be at the positions 
indicated. On the other hand, the temperature factor of the N in the 
special position decreased considerably to a negative value as if to 
compensate for the other. This, in retrospect, was already indicative of 
strong interaction between the thermal parameters of these two atoms. 
Another important change was the large one, to —0.392, in the value of 
the O(III) z-parameter; this implies a very short S—O distance, 1.31
Å, in one set of the SO₄ groups.

The estimated error of fit’’'® at the end of the least squares calculation 
of cycle 1 was very much lower than the first computed error of fit,” 
and it appeared that by readjustment of some of the temperature factors 
we could go a step further toward convergence before changing to
anisotropic temperature factors. Initially cycle 2 showed that even with the
readjustment of temperature factors, the R value† had dropped from
0.473 to 0.303, the weighted R from 0.299 to 0.193. But the error of fit
was higher than that estimated in cycle 1 on the basis, of course, of the
parameters computed in that cycle, some of which were physically
impossible (i.e., negative temperature factors).
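
The three figures of merit quoted throughout this section can be sketched as follows. The text does not spell out their definitions, so the conventional ones are assumed here: R = Σ|Δ|/Σ|F_o|, weighted R = [ΣwΔ²/Σw|F_o|²]^½, and error of fit = [ΣwΔ²/(m − n)]^½, with Δ = |F_o| − |F_c|, w = 1/σ², m observations and n varied parameters. The input numbers are made up.

```python
import math

def agreement_indices(f_obs, f_calc, sigma, n_params):
    """Return (R, weighted R, error of fit) under the conventional definitions."""
    deltas = [abs(fo) - abs(fc) for fo, fc in zip(f_obs, f_calc)]
    weights = [1.0 / (s * s) for s in sigma]
    r_value = sum(abs(d) for d in deltas) / sum(abs(fo) for fo in f_obs)
    weighted_sum = sum(w * d * d for w, d in zip(weights, deltas))
    weighted_r = math.sqrt(weighted_sum /
                           sum(w * fo * fo for w, fo in zip(weights, f_obs)))
    error_of_fit = math.sqrt(weighted_sum / (len(f_obs) - n_params))
    return r_value, weighted_r, error_of_fit

# Hypothetical amplitudes, standard deviations and a single varied parameter.
r, wr, fit = agreement_indices([10.0, 20.0, 30.0], [9.0, 22.0, 30.0],
                               [1.0, 1.0, 1.0], n_params=1)
print(f"R = {r:.3f}, weighted R = {wr:.3f}, error of fit = {fit:.2f}")
```

Under these definitions an error of fit near unity would indicate that the discrepancies are about as large as the assigned standard deviations.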

However, cycle 2 ended with an estimated error of fit somewhat lower 
than that of cycle 1. The N(II) y-parameter decreased toward the value 
which we believe to be the more nearly correct one, but the N(II) B 
value increased greatly and the N(I) B value became a large negative 
value. Also the x-parameter of N(I) decreased to imply an unlikely short
C—N distance. Changes in the S and Al positional parameters were not
large but several oscillations occurred. The O(III) (atom 10) z-parameter
returned to −0.400, but even this value implied a rather short S—O
distance, 1.37 Å.

At this point, it seemed necessary to change to anisotropic thermal 
parameters. The Busing-Levy program will compute these from the
isotropic thermal parameters using the following relations: β₁₁ = Ba*²/4;
β₁₂ = (Ba*b* cos γ*)/4; etc.
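
As an illustration of these relations, the sketch below converts an isotropic B into the six β's for a hexagonal cell, for which a* = b* = 2/(a√3), c* = 1/c, γ* = 60° and α* = β* = 90°. The function name and the value B = 4 Å² are assumptions made for the example; this is not the Busing-Levy routine itself.

```python
import math

def isotropic_to_anisotropic(b_iso: float, a: float, c: float) -> dict:
    """Anisotropic thermal parameters equivalent to an isotropic B (hexagonal cell)."""
    a_star = 2.0 / (a * math.sqrt(3.0))  # reciprocal axis lengths, per angstrom
    c_star = 1.0 / c
    cos_gamma_star = 0.5                 # gamma* = 60 degrees
    b11 = b_iso * a_star ** 2 / 4.0
    return {
        "b11": b11,
        "b22": b11,                      # b* equals a* in the hexagonal system
        "b33": b_iso * c_star ** 2 / 4.0,
        "b12": b_iso * a_star * a_star * cos_gamma_star / 4.0,
        "b13": 0.0,                      # cos(beta*) = 0
        "b23": 0.0,                      # cos(alpha*) = 0
    }

betas = isotropic_to_anisotropic(4.0, 11.75, 8.94)  # hypothetical B, cell of this work
for name, value in betas.items():
    print(f"{name} = {value:.5f}")
```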

The starting parameters were those computed in cycle 1 and adjusted 
for cycle 2 (see Table II). For cycle 3, a critical estimate of the reflec- 
tions really observable by the single-crystal automatic diffractometer 
was made. This resulted in the removal from the calculation of 43 
unobserved reflections, some of which had rather high calculated struc- 
ture amplitudes when compared with the respective estimated threshold 
values. Included in cycle 3 was a rejection test: that is, when Δ/σ was
>10.00, the reflection was not counted in the calculation of the R
values or the standard error of fit, nor was it included in the least squares
calculation. This reduced the number of F_hk·l’s used in the least squares
calculation to 790. (Unfortunately the R values and the calculated
amplitudes computed in this cycle have been lost.)
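
The rejection test can be sketched as a simple filter. The snippet assumes that Δ is the absolute difference between observed and calculated amplitudes; the reflections listed are hypothetical.

```python
def keep_reflection(f_obs: float, f_calc: float, sigma: float,
                    limit: float = 10.0) -> bool:
    """True if the reflection passes the rejection test (delta/sigma not above limit)."""
    delta = abs(abs(f_obs) - abs(f_calc))
    return delta / sigma <= limit

# Hypothetical (F_obs, F_calc, sigma) triples.
reflections = [(12.0, 11.0, 0.5), (3.0, 15.0, 1.0), (40.0, 38.5, 2.0)]
kept = [r for r in reflections if keep_reflection(*r)]
print(f"{len(kept)} of {len(reflections)} reflections retained")
```

Because the test depends on the calculated amplitudes, the number of reflections retained changes from cycle to cycle, as the counts quoted for cycles 3 and 4 show.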

The estimated error of fit resulting from the cycle 3 least squares 
calculation decreased from 4.99 to 2.30, an apparently tremendous im- 
provement. However, the still incorrect N(II) y-parameter did not 
improve; also the values of the N(II) thermal parameters greatly in- 
creased. The O(III) values still implied a short S—O distance. The 
C(I) z-parameter indicated possible nonplanarity of the guanidinium
ion in the special position, but this parameter also had an apparently
large estimated error, 0.0115, indicative of potential difficulty. Twelve of
the atoms had calculated thermal parameters which did not satisfy all
the criteria for physical reality (see Ref. 13). Therefore, for cycle 4 some
of the thermal parameters had to be adjusted to satisfy these criteria.
Also, the N(II) y-parameter was corrected. The R value and error of fit
decreased considerably since cycle 2, but the weighted R value increased
slightly. The same rejection test as used for cycle 3 allowed 809 reflections
to be included in the cycle 4 calculation. The least squares calculation
led to an estimated error of fit of 2.23, not too different from that
estimated in the previous cycle.

† Unless otherwise stated, the R value is that for the independent F_hk·l’s, i.e.,
multiplicity is neglected. This is the R value calculated by the Busing-Levy
program.

In cycle 4, the values of the N(II) thermal parameters decreased, 
indicating that the high values had been caused by the wrong y-parame- 
ter. One would prefer to think, however, that the y-parameter should 
have tended to approach the correct value rather than to have the 
thermal parameters act as if the atom should be removed. This time the 
x-parameter of N(I) (atom 6) became rather large, implying too large a
C—N distance. A number of the other positional parameters showed
oscillation, and again there were twelve atoms which had thermal
parameters not satisfying the criteria for physical reality (Table II).
The O(III) z-parameter continued to imply a short S—O distance.
The C(II) and N(II) atoms did not have the same values in z-parameter,
nor did the C(I) and N(I) atoms have the same z-parameter. Also,
in this cycle many of the scale factors, especially ss , had almost reached
their starting values after having at first decreased substantially.

The necessary adjustments were made on the thermal parameters 
before cycle 5 was carried out. Also, the rejection test was removed. Five
reflections which appeared to have substantial contribution from the 54
hydrogen atoms or to have suffered from extinction were given zero
weight. Thus, of the 852 reflections, 847 were used in the cycle 5 least
squares calculation. Because some of the initially estimated σ(F_o)’s were
really very small, a few of these also were readjusted. Initially the R
value was 0.198, while the weighted R decreased to 0.139, this latter
reduction resulting mostly perhaps from the few adjustments made on
the σ(F_o)’s. The error of fit for the 847 reflections was larger than for
the 809 of the previous cycle. The calculated estimated error of fit after 
the least squares calculation did decrease, however. 

But in cycle 5 there was no improvement in the way the calculation 
was going. There were further oscillations, and, very important, the 
C(II)—N(II) distance continuing from cycle 3 was short, whereas the
C(I)—N(I) distance continued to be long. Considering the guanidinium
ion to be planar, the C—N distances were respectively 1.22 and 1.48 Å;
the average is 1.35 Å, in good agreement with the acceptable guanidinium
C—N value 1.34 Å.¹⁶ Again this indicated interaction between the N(II)
x- and y-parameters and the N(I) x-parameter. Also, the parameter
values of the S(I) and O(III) atoms still indicated an improbably short
S—O distance. There were other indications of interaction: for example,
the y-parameters of the O(V) and O(VI) atoms (2 and 8 respectively)
behaved strangely, that of O(V) indicating an improbably large [SO₄]
O—O distance, that of O(VI), too small an [SO₄] O—O distance.
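
Interatomic distances such as these follow from the fractional coordinates and the hexagonal metric, d² = a²(Δx² + Δy² − ΔxΔy) + c²Δz². The sketch below assumes a planar ion with the carbon atom on the threefold axis at (0,0,z) and the nitrogen at (x,0,z), and uses x = −0.113 purely as an illustrative N(I) parameter; the coordinates are not refined values from Table II.

```python
import math

A_AXIS = 11.75  # angstroms, lattice constant of this work
C_AXIS = 8.94   # angstroms, lattice constant of this work

def hexagonal_distance(p, q, a=A_AXIS, c=C_AXIS):
    """Distance in angstroms between fractional coordinates p and q (hexagonal cell)."""
    dx, dy, dz = (q[i] - p[i] for i in range(3))
    # gamma = 120 degrees, so the cross term 2*dx*dy*cos(gamma) equals -dx*dy.
    return math.sqrt(a * a * (dx * dx + dy * dy - dx * dy) + c * c * dz * dz)

carbon = (0.0, 0.0, 0.0)       # assumed: carbon on the threefold axis
nitrogen = (-0.113, 0.0, 0.0)  # assumed: nitrogen at the same height z

d_cn = hexagonal_distance(carbon, nitrogen)
print(f"C-N distance for x = -0.113: {d_cn:.3f} angstroms")
print(f"x magnitude giving 1.34 angstroms: {1.34 / A_AXIS:.4f}")
```

In this geometry a shift of only 0.01 in the nitrogen x-parameter changes the bond length by about 0.12 Å, which is why modest parameter oscillations produce the implausible distances reported.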

It seemed at the time, however, that there might be other possibilities 
for explaining the course of events in the attempt to refine the structure. 
For example, there could be many reflections to which the hydrogen
atoms would contribute, and, perhaps particularly because this is a
noncentrosymmetric structure, the affected structure amplitudes were
having a detrimental effect. Therefore, in cycle 6 all reflections for which
sin²θ/λ² < 0.0800 were given zero weight. Necessary adjustments were
made in thermal parameters (Table II); the N(I) and N(II) positional
parameters were readjusted each to yield the C—N distance 1.34 Å;
and the O(V) and O(VI) y-parameters were adjusted to yield more
reasonable [SO₄] O—O distances. The R value for the 755 amplitudes
(with nonzero weights) was 0.200, weighted R = 0.128 and error of fit,
2.82.
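
The low-angle cutoff can be expressed through the interplanar spacing, since sin²θ/λ² = 1/(4d²) and, for a hexagonal cell, 1/d² = 4(h² + hk + k²)/(3a²) + l²/c². The sketch below applies the 0.0800 threshold using the lattice constants of this work; the function names are hypothetical.

```python
A_AXIS = 11.75   # angstroms, lattice constant of this work
C_AXIS = 8.94    # angstroms, lattice constant of this work
CUTOFF = 0.0800  # threshold on sin^2(theta)/lambda^2 used in cycle 6

def sin2_theta_over_lambda2(h, k, l, a=A_AXIS, c=C_AXIS):
    """sin^2(theta)/lambda^2 for reflection hk.l of a hexagonal cell."""
    inv_d2 = 4.0 * (h * h + h * k + k * k) / (3.0 * a * a) + l * l / (c * c)
    return inv_d2 / 4.0

def weight_factor(h, k, l):
    """0.0 for low-angle reflections below the cutoff, 1.0 otherwise."""
    return 0.0 if sin2_theta_over_lambda2(h, k, l) < CUTOFF else 1.0

for hkl in [(1, 0, 0), (0, 0, 5), (0, 0, 6), (6, 0, 0)]:
    s = sin2_theta_over_lambda2(*hkl)
    print(hkl, f"{s:.4f}", "kept" if weight_factor(*hkl) else "zero weight")
```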

In the cycle 6 least squares calculation, only 43 parameters were 
varied: the scale factors and all positional parameters except the N(I) 
x-parameter. The estimated error of fit decreased to 2.38, but this cycle 
was also discouraging in that again there were oscillations and some 
rather large changes in parameter. The S(I)—O(III) distance continued
to remain improbably short; the O(VI) y-parameter again implied too
short an [SO₄] O—O distance; and the values of the N(II) x- and
y-parameters implied a C(II)—N(II) distance of 1.25 Å.

In the paper on the gallium isomorph,’ we had concluded that the
arrangement of the guanidinium ions on the threefold axes was related
to that at 3m to close approximation by ⅓,⅔,0 and ⅔,⅓,0 + (u,0,w;
0,u,w; ū,ū,w). However, some doubt remained, and therefore it was
decided to try some different orientations of the guanidinium groups.

For cycle 7, the N(II) parameters were readjusted, presumably back
to the starting parameters of cycle 6. However, a card-punch error
(0.5333 instead of 0.5533) was made in the x-parameter. The N(I)
parameter was set to −0.1130. This we shall call the (−,−)† orientation.
The positional parameters of O(VI) were also readjusted. The R
value for the 755 reflections increased to 0.250, the weighted R to 0.187,
and the error of fit to 4.10. In cycle 7 all scale and positional parameters
were varied. At the end of the cycle, the estimated error of fit was 3.53.
The C(II)—N(II) distance again was too short, ~1.21 Å; again the
O(VI) y-parameter decreased from the adjusted value; the difference in
the C(I) and N(I) z-parameters increased. Also again there were
oscillations. The results of cycle 7 did not look promising.

† This symbolism is derived as follows: The ± orientations of N(I) are those for
which in (x,0,z) of positions 3c, x_N(I) = ±u, where u is very nearly 0.113. The ±
orientations of N(II) are those for which in (x,y,z) of positions 6d, x_N(II) = ⅓ ± u,
y = ⅔. Thus (−,−) here means that x_N(I) = −0.113, x_N(II) = 0.220, y_N(II) = 0.667.
By symmetry the latter two are equivalent to 0.553 and 0.333 respectively.

In cycle 8, the (+,+) arrangement of the guanidinium ions was tried 
with the other starting parameters the same as those used in cycle 7. 
In this case the R value for the 755 amplitudes was 0.231, weighted R, 
0.155, and error of fit, 3.40. Again only scale and positional parameters 
were varied. The estimated error of fit obtained at the end of the least 
squares calculation was 3.14. The results of this cycle looked promising. 
The C—N distances looked good; the O(V) and O(VI) parameters 
were not too bad. However, the S(I)—O(III) distance still looked 
improbably short. The agreement for individual amplitudes actually did 
not look as good as it did in cycle 6, but it was felt that perhaps some of 
this poorer agreement resulted from hydrogen contributions and/or from 
required changes in thermal parameters. 

It was decided to continue to cycle 9 using the values of scale and 
positional parameters obtained in cycle 8. The R value increased to 
0.240; the weighted R value decreased to 0.140; the error of fit was very 
close to that previously estimated. Despite this, the parameter results 
of this cycle (Table II) looked even better than those of the previous 
cycle, but the S(I)—O(III) distance continued to be improbably 
short. 

The scale and positional parameters resulting from cycle 9 were used
in cycle 10. There was not much change in R, weighted R or error of fit.
In cycle 10, all scale and positional parameters which had changed less
than 1σ in cycle 9 were held constant and all thermal parameters were
allowed to vary. The estimated error of fit at the end of the cycle was
2.55. It appeared that the thermal parameters of the N(II) atom
increased considerably as if trying to eliminate this atom, and as before
this seemed to be an indication that the N(II) atom was not placed
correctly. Also as if to compensate, the previously large β₃₃ of N(I),
0.01480, decreased to −0.00095. Eight of the atoms had thermal
parameter matrices which were not positive definite.

With this continued disappointment, another notion became more
important. Was it possible that the structure given by Varfolomeeva
et al. was correct? It seemed advisable to make the calculation with the
model proposed by those authors. The results proved that the structure
cannot possibly be correct. The initial R was 0.559, weighted R, 0.473,
and error of fit, 10.38 for the 755 reflections. Examination of the
calculated and observed amplitudes showed a great many very large
discrepancies indicative of an improbable structure. Only the scale and
positional parameters were varied in the least squares calculation.
Thermal parameters for the N atoms were those initially used in cycle 6.
All other thermal parameters were essentially those obtained in cycle 10
with necessary adjustments made. The initial and final positional
parameters are shown separately in Table III. The estimated error of fit
was 8.92, indicating no real possibility of convergence. The parameter
changes were mostly drastic. The N(I) x-parameter, for example, would
imply a C(I)—N(I) distance of 1.16 Å. Interestingly enough, the
S(I)—O(III) distance continued to remain very short.
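
The way a positional parameter "implies" a bond distance can be sketched as follows. This is only an illustration under stated assumptions: the cell edges `A_CELL` and `C_CELL` are not quoted in this section and are placeholders (the value of `A_CELL` was picked so that x = 0.0987 reproduces the 1.16 Å mentioned above), and the function name `hex_distance` is hypothetical.

```python
import math

def hex_distance(p, q, a, c):
    """Distance between fractional coordinates p and q in a hexagonal cell
    (alpha = beta = 90 deg, gamma = 120 deg); result is in the units of a, c."""
    dx, dy, dz = (p[i] - q[i] for i in range(3))
    # For gamma = 120 deg, cos(gamma) = -1/2, so the cross term is -dx*dy.
    return math.sqrt(a * a * (dx * dx + dy * dy - dx * dy) + c * c * dz * dz)

# ASSUMED cell edges in angstroms (placeholders, not taken from this section).
A_CELL = 11.75
C_CELL = 8.95   # irrelevant below because dz = 0 for a planar ion

c1 = (0.0, 0.0, 0.45)       # C(I) on the threefold axis at 0,0
n1 = (0.0987, 0.0, 0.45)    # N(I) at (x,0,z); same z by the planarity assumption
d_cn = hex_distance(n1, c1, A_CELL, C_CELL)
print(round(d_cn, 2))
```

With the carbon on the axis and the nitrogen at (x,0,z), the distance reduces to a·|x|, which is why small changes in the N(I) x-parameter translate directly into changes in the C(I)—N(I) distance.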

In cycle 12, the guanidinium ions on the two threefold axes (i.e., at
1/3,2/3 and 2/3,1/3) were turned 30° from their original positions. The
thermal parameters were the same as those used initially in cycle 11 and
are shown in the next to the last columns of Table II. The R value was
0.238, weighted R, 0.154, and error of fit, 3.38 (the latter two being
somewhat higher than for the starting parameters of cycle 10). The
estimated error of fit obtained from the least squares calculation was 3.14.
The results of this calculation did not look promising. The C(I)—N(I)
distance was large; there was an extraordinarily large change in the
z-parameter of O(VIII). Also, agreement of many individual amplitudes
was poorer than for the very first orientation of the guanidiniums. In
fact, from the calculations of cycles 7-10 and cycle 12, it had become
apparent that the (+,−) orientation was indeed the best. It also
appeared that disorder or rotation‡ of the guanidinium ions was highly
unlikely unless very subtle. In the case of complete disorder or the
equivalent free rotation, there would be no contributions from the
nitrogen atoms to the amplitudes $F_{hk\cdot l}$, h − k ≠ 3n, exactly as
in the case of the (+,+) orientation. This alone makes it appear that the
originally reported (+,−) orientation of the guanidinium ions was
corroborated.

In cycle 12, the normal equations and inverse matrices were obtained.
Examination of the inverse matrix showed that there were large values
of correlation coefficients, $\rho_{ij} = b_{ij}/\sqrt{b_{ii} b_{jj}}$,
for many pairs of parameters. A few examples are:


‡ Two reports¹⁷,¹⁸ based on nuclear magnetic resonance investigations of
G.A.S.H. mention the possibility of rotation of the guanidinium groups. We
have learned (by private communication) from, and have been permitted to
quote, the author, D. W. McCall, of one of these, that further
investigation now indicates that this rotation is highly unlikely.





TABLE III — POSITIONAL PARAMETERS, CYCLE 11

                            Initial                          Final
 No.  Atom           x         y         z          x         y         z
  1   N(II)        0.2200   −0.3330    0.0000     0.2047   −0.3304   −0.0305
  2   O(V)         0.3449    0.1166   −0.3130     0.3512    0.1250   −0.3122
  3   O(VI)       −0.3211   −0.1137    0.2450    −0.3281   −0.1123    0.2025
  4   O(IX)       −0.4654    0.3272    0.3400    −0.4639    0.3327    0.3434
  5   O(X)         0.4647   −0.3391    0.5600     0.4647   −0.3422    0.5294
  6   N(I)         0.1182    0         0.4500     0.0987    0         0.4124
  7   O(VIII)     −0.1304    0        −0.1208    −0.1375    0        −0.1073
  8   O(VII)       0.1426    0         0.117      0.1367    0         0.1260
  9   O(IV)       −0.2912    0         0.4756    −0.3180    0         0.4471
 10   O(III)       0.3699    0        −0.0820     0.3651    0        −0.1185
 11   O(II)       −0.4412    0         0.2764    −0.4246    0         0.2739
 12   O(I)         0.4538    0        −0.3260     0.4688    0        −0.3351
 13   S(I)         0.3479    0        −0.2433     0.3469    0        −0.2502
 14   S(II)       −0.3174    0         0.3144    −0.3200    0         0.3079
 15   C(II)        1/3       2/3       0.0000     1/3       2/3       0.0110
 16   Al³⁺(II)     1/3       2/3       0.4430     1/3       2/3       0.4396
 17   C(I)         0         0         0.4500     0         0         0.4694
 18   Al³⁺(I)      0         0         …          0         0         …
    z_O(V)—z_O(II),     0.81
    z_O(V)—z_O(VI),     0.58
    z_O(IX)—z_O(VII),   0.84
    z_O(III)—z_O(IV),   0.65
    z_S(I)—z_S(II),     0.96.


It is noteworthy that the correlation coefficient for x_N(II)—x_N(I) was
very low, 0.10; it will be seen later that this low value resulted from the
incorrect orientation of the guanidinium (II) ions.
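
The normalization of the inverse matrix into correlation coefficients can be sketched in a few lines. This is a minimal illustration, not the program used in the study; the 3×3 matrix `b` holds made-up numbers, and the function name is hypothetical.

```python
import math

def correlation_matrix(b):
    """Normalize an inverse normal-equations matrix b:
    rho[i][j] = b[i][j] / sqrt(b[i][i] * b[j][j])."""
    n = len(b)
    return [[b[i][j] / math.sqrt(b[i][i] * b[j][j]) for j in range(n)]
            for i in range(n)]

# Hypothetical inverse matrix (illustrative values only, not from the paper).
b = [[4.0, 1.8, 0.2],
     [1.8, 1.0, 0.1],
     [0.2, 0.1, 2.0]]
rho = correlation_matrix(b)

# Keep only the pairs with |rho| >= 0.40, using 1-based parameter numbers
# in the same way as the correlation tables in the text.
strong = [(i + 1, j + 1, round(rho[i][j], 2))
          for i in range(len(b)) for j in range(i + 1, len(b))
          if abs(rho[i][j]) >= 0.40]
print(strong)
```

Because the diagonal of the normalized matrix is unity by construction, only the off-diagonal terms carry information about interdependence.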

It seemed unlikely that the weighting scheme could be the cause of 
the difficulties encountered. Nevertheless, it was decided to try a 
weighting scheme radically different from that used in the first twelve 
cycles. 

In cycle 1′ (Table IV), all amplitudes with sin²θ/λ² < 0.0800 were
still weighted zero. Also all unobserved amplitudes were to be weighted
zero and all observed, unity. However, a number of amplitudes which
should have been weighted zero were weighted unity, and a few which
should have been weighted unity were weighted zero. This left 534
reflections included in the least squares calculation. The initial
parameters were those from cycles 9 and 10, except for the N's, which were
started at the exact (+,−) orientation, and the O(III) z-parameter,
which was started at −0.405 to give an S—O value closer to 1.48 Å.
The R value was 0.204, weighted R, 0.149, and error of fit, 2.19 for the
534 amplitudes and these parameter values. The least squares calculation
gave an estimated error of fit of 1.90. Again the S(I)—O(III) distance
decreased to 1.38 Å, the C(I)—N(I) distance increased again and the
C(II)—N(II) distance decreased again. Some of the other distances are
listed in Table V.

Starting with this calculation, the vector
$v_j = \sum(\sqrt{w}\,D_j)(\sqrt{w}\,\Delta)$ was obtained as output† as
well as the direct and inverse matrices, the purpose being to see whether
Δpᵢ's from the diagonal term approximations would be much different from
those obtained by the exact solution of the normal equations. Not many of
these were checked in this and subsequent cycles, but enough differences
were found to indicate the importance of the off-diagonal terms.
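
The comparison described above can be illustrated with a two-parameter toy problem. The sketch below is an assumption-laden example, not the program patch mentioned in the text: the normal matrix `a` and vector `v` are invented numbers, chosen only so that the off-diagonal term is large.

```python
def solve_2x2(a, v):
    """Exact solution of the 2x2 normal equations a * dp = v (Cramer's rule)."""
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [(v[0] * a[1][1] - a[0][1] * v[1]) / det,
            (a[0][0] * v[1] - a[1][0] * v[0]) / det]

def diagonal_shifts(a, v):
    """Diagonal-term approximation: every off-diagonal element is ignored."""
    return [v[j] / a[j][j] for j in range(len(v))]

# Hypothetical normal matrix with a large off-diagonal term (0.9).
a = [[1.0, 0.9],
     [0.9, 1.0]]
v = [1.0, 0.5]

full = solve_2x2(a, v)        # shifts from the complete solution
diag = diagonal_shifts(a, v)  # shifts from the diagonal approximation
print(full, diag)
```

In this toy case the second shift even changes sign between the two treatments, which is the kind of difference that makes the off-diagonal terms important when parameters are strongly coupled.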

It appeared that it would be most convenient to have the correlation 
or normalized inverse matrix to examine in each cycle. A program patch 
to enable us to do this was written by Misses D. C. Leagus and B. B. 
Cetlin. 


† The program patch for this calculation was written by Miss D. C. Leagus.
TABLE V — SOME INTERATOMIC DISTANCES OBTAINED FROM
LEAST SQUARES CALCULATIONS (SECOND SET OF WEIGHTS)

 Distance          Cycle 1′, Å    Cycle 2′, Å
 C(I)—N(I)            1.40           1.43
 C(II)—N(II)          1.29           1.25
 S(I)—O(V)            1.46           1.44
 S(I)—O(III)          1.38           1.38
 S(I)—O(I)            1.46           1.44
 S(II)—O(VI)          1.47           1.48
 S(II)—O(IV)          1.48           1.50
 S(II)—O(II)          1.50           1.49
 Al(I)—O(VII)         1.92           1.89
 Al(I)—O(VIII)        1.86           1.86
 Al(II)—O(IX)         1.90           1.92
 Al(II)—O(X)          1.91           1.91








In cycle 2′ the starting parameters were the same as those resulting
from cycle 1′ (new weights) except for the z-parameters of N(I) and
N(II) and the x-parameter of N(I). Also, it was found that under the
conditions set for the weighting in cycle 1′, only 496 amplitudes should
have been weighted unity. For these reflections and the starting
parameters shown in Table III, the R value was 0.176, weighted R, 0.119,
and error of fit, 1.85. Again only scale and positional parameters were
allowed to vary. Changes were not large except for the N and C(II)
parameters. Some distances calculated from these parameters are given in
Table V. (C—N distances are always on assumption of planarity of the
guanidinium group.) Note that again the C(I)—N(I) distance is long, the
C(II)—N(II) short, but the average is the expected value for such a
bond. Also noteworthy is the continued tendency of S(I)—O(III) to
be short. In fact, there is a tendency throughout for the S(I)—O
distances to be shorter on the average than the S(II)—O distances.
Examining the correlation matrix for this cycle we may summarize the
results as follows (Table VI). Only those pairs for which |ρ| ≥ 0.40 are
listed. Thus of the 946 ρᵢⱼ (i ≠ j) terms only 75 are ≥ 0.40. Important
also is the fact that a large number, 677, of the terms are less than 0.10,
many much less than 0.10; 194 of the |ρᵢⱼ| lie between 0.10 and 0.40.
These could be important especially if one parameter has many
interactions of moderate size with other parameters.
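
The counts quoted above can be checked with a little arithmetic: 44 varied parameters give 44·43/2 = 946 distinct off-diagonal terms, and the three quoted groups add up to that total. The sketch below also shows one way of binning |ρ| values; the helper names and the sample values are hypothetical, and the treatment of values falling exactly on a bin edge is an assumption.

```python
def n_pairs(n_params):
    """Number of distinct off-diagonal pairs (i < j) among n_params parameters."""
    return n_params * (n_params - 1) // 2

def bin_counts(abs_rhos, low=0.10, high=0.40):
    """Count |rho| values below low, in [low, high), and at or above high."""
    small = sum(1 for r in abs_rhos if r < low)
    large = sum(1 for r in abs_rhos if r >= high)
    return small, len(abs_rhos) - small - large, large

# 44 parameters were varied in cycle 2' (Table VI numbers them 1 to 44).
total = n_pairs(44)
# The three counts quoted in the text partition the off-diagonal terms.
quoted = 677 + 194 + 75
print(total, quoted)

# Hypothetical |rho| values, only to show the binning.
example = bin_counts([0.02, 0.08, 0.15, 0.39, 0.40, 0.96])
print(example)
```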

Earlier we gave some examples of |ρᵢⱼ| that were calculated from the
inverse matrix of cycle 12 (old weights). It is seen from examination of
Table VI that the values for the particular |ρᵢⱼ| obtained from cycle 2′
are essentially the same except for the value for the x_N(II)—x_N(I)
interaction. The value is much higher, 0.62, than the one, 0.10, obtained

TABLE VI — CORRELATION COEFFICIENTS FROM CYCLE 2′ (ONLY |ρᵢⱼ| ≥ 0.4 ARE LISTED)

   |ρ|        i—j₁,j₂,j₃,···
 0.40-0.50    11-17,20,23,27,29,37; 14-41; 17-20,23,27,29,31,35,43;
              20-25,33,37; 23-31,33,37; 25-29,39; 27-31,37,43;
              29-31,33,37; 31-37; 33-39; 35-37
 0.50-0.60    9-10; 11-39,41,43; 12-15; 14-17,37; 17-39,41; 18-19;
              20-31,43; 21-22; 23-43; 27-29,39,41; 29-39,41,43;
              31-41,43; 33-41; 37-39,41,43
 0.60-0.70    9-24; 11-25; 13-34; 16-36; 18-28; 20-23,27,35,39,41;
              21-26; 23-29,41; 31-33,39; 38-40
 0.70-0.80    23-39; 39-43; 41-43
 0.80-0.90    14-35; 17-37; 20-29; 23-27; 42-44
 0.90-1.00    39-41

Parameter Numbers

 Parameter   s₁   s₂   s₃   s₄   s₅   s₆   s₇   s₈
 Number       1    2    3    4    5    6    7    8

 Parameter   N(II): x  y  z    O(V): x  y  z    O(VI): x  y  z    O(IX): x  y  z
 Number             9 10 11         12 13 14          15 16 17          18 19 20

 Parameter   O(X): x  y  z    N(I): x  z    O(VIII): x  z    O(VII): x  z    O(IV): x  z
 Number           21 22 23         24 25            26 27           28 29          30 31

 Parameter   O(III): x  z    O(II): x  z    O(I): x  z    S(I): x  z    S(II): x  z    C(II): z
 Number             32 33          34 35         36 37         38 39          40 41          42

 Parameter   Al(I): z    C(I): z
 Number            43         44
from the incorrect orientation of the guanidinium ions. Thus, incorrect
values for parameters can uncouple parameters. Furthermore, this
appears to be the reason that there was not much change in the incorrect
y-parameter of N(II) in the first three cycles. That is, a parameter which
is given a value which tends to make it independent may not change
rapidly to a value which tends to make it dependent.

The purpose of the next cycle was to see the results of allowing the
parameters, both positional and thermal, of only the N and O(III)
atoms to vary. Before carrying out this calculation, however, the
positions of hydrogen atoms were estimated. The guanidinium ions were
considered to be essentially planar, and the z-parameters of the
guanidinium H's taken as 0.55 for those about the threefold axes at
1/3,2/3 and 2/3,1/3, and 0.505 for those about the axis at 0,0. For the
water molecules, the links with the SO₄ oxygen atoms were considered and
the tilt of the water molecule estimated accordingly. In any given level
of H₂O molecules about either of the nonequivalent axes, the z's were
taken equal. The H—O—H angle was taken as 105° and the O—H distance,
0.96 Å. (The initial H-parameter values will not be listed; however, the
last set used will be listed later.) First, H contributions to the
$F_{hk\cdot l}$ for h,k,l positive were calculated for two different
orientations of the guanidinium ions, namely: (+,−) and (+,+). (The
program used for this calculation was written by R. G. Treuting; the
atomic scattering factors for H were those of Viervoll and Ogrim.) These
calculations, together with consideration of previous calculations of the
amplitudes, corroborated the conclusion that the (+,−) orientation was
the most probable one.

The N-parameters were readjusted to yield the most probable C—N
distance, and the z-parameter of O(III) was started at −0.405. Those
observed amplitudes with sin²θ/λ² < 0.0800 which were not strongly
affected by extinction were reweighted unity. The total number of
reflections weighted unity was 568. The H atoms were put into the
calculation as "fixed atoms" (see Ref. 13) with isotropic temperature
factor B = 3.00 Å². The over-all R value was 0.177, weighted R, 0.117,
and error of fit, 1.90.

The results of the least squares calculation are given in Table IV,
cycle 3′. It is seen that the O(III) z-parameter returned to that of the
previous cycle. The N(I) x-parameter increased somewhat, implying a
C(I)—N(I) distance of 1.37 Å. The parameters of N(II) imply a
C(II)—N(II) distance of 1.33 Å.

In Table VII, we list those correlation coefficients greater than or
equal to 0.40. If this table is compared with Table VI, one finds that the
coupling of N(I) and N(II) positional parameters is still as strong as in
the previous cycle. In both cycles 2′ and 3′, the correlation matrices
showed no strong interaction between O(III) and nitrogen atom
parameters. The correlation matrix of cycle 3′ indicated that there are
some very strong interactions in pairs of thermal parameters. As
expected, there was corroboration of a strong interaction between the
β₃₃'s of the N atoms.

For this case, it might be worthwhile to show the Δpᵢ's obtained from
the complete solution of the 21 normal equations compared with those
obtained from the diagonal term approximation. These are given in
Table VIII together with the σ's calculated by the Busing-Levy program.
As expected, several of the Δpᵢ's for particular i are quite different,
particularly for those which are highly correlated (see Table VII).

Before proceeding to the next cycle, the calculated and observed data 
were examined for any outstanding discrepancies and rechecks were 
made on the intensity data. It was found that 27 of the reflections which 
were listed as observed should have been listed as unobserved. It was 
also found that 5 reflections which were recorded as unobserved should 
have been observed by the instrument but were missed. These were 
obtained from film data. 

Slight changes were made in the H-parameters; the x-parameter of
N(I) was returned to 0.113 and necessary changes made in the β₁₁ and
β₁₃ thermal parameters of N(I). Now the Busing-Levy program calculates
and stores all derivatives, so that it is possible to allow different
sets of parameters to remain constant and solve for sets of Δpᵢ for each
initial set of parameters. In cycle 4′a, therefore, we first allowed only
the N(I), N(II), and O(III) parameters to vary and then in 4′b,


TABLE VII — CORRELATION COEFFICIENTS FROM CYCLE 3′

   |ρ|         i,j
 0.40-0.50     1,2    7,12    8,9    8,15
 0.50-0.60     4,7
 0.60-0.70     1,10   5,7     5,14
 0.70-0.80     3,11   6,13
 0.80-0.90
 0.90-1.00     12,14  18,20

Parameter numbers

            x    y    z    β₁₁  β₂₂  β₃₃  β₁₂  β₁₃  β₂₃
 N(II)      1    2    3     4    5    6    7    8    9
 N(I)      10        11    12   13   14        15
 O(III)    16        17    18   19   20        21
TABLE VIII — PARAMETER CHANGES AND ERROR ESTIMATES
FROM CYCLE 3′

 Parameter    Δpᵢ, complete solution    Δpᵢ, diagonal         σ
  number         (Busing-Levy)          approximation
     1             −0.0029                −0.0020           0.0027
     2             −0.0056                −0.0027           0.0022
     3              0.0058                 0.0046           0.0037
     4             −0.00661               −0.00105          0.00268
     5              0.00270                0.00418          0.00245
     6              0.00231                0.00477          0.00420
     7             −0.00273               −0.00136          0.00202
     8             −0.00227               −0.00333          0.00271
     9             −0.00371               −0.00053          0.00225
    10              0.0048                 0.0051           0.0025
    11              0.0057                −0.0030           0.0048
    12              0.00411                0.00023          0.00296
    13             −0.00124                0.00172          0.00527
    14              0.00614                0.00113          0.00466
    15             −0.01098               −0.00983          0.00552
    16             −0.0003                −0.0009           0.0014
    17              0.0121                 0.0120           0.0023
    18              0.00092                0.00160          0.00228
    19             −0.00216               −0.00133          0.00311
    20             −0.00145               −0.00230          0.00307
    21              0.00419                0.00025          0.00351





varied all parameters except the scale factors. The results are shown in
Table IV. Again in both cases, the N(I) x-parameter increased; there
were changes in the N(II) parameters, but the implied C—N distance
1.35 Å was good. Also the z-parameter of O(III) seemed to improve,
especially when all the parameters were allowed to vary. But in 4′a,
the thermal parameter matrix of the N(I) atom was not positive
definite, while in 4′b, seven atoms had thermal parameter matrices which
were not positive definite. Also there were continued oscillations and
large error estimates. It was evident that real convergence would not be
attained.

However, because the N and O(III) parameters did look encouraging,
it was decided to try one more cycle. This time the parameters of the
water hydrogen atoms were recalculated in a somewhat different way.
In a recent paper, Aleksandrov, Lundin and Mikhailov report results
of a study of the distribution of hydrogen atoms in guanidinium aluminum
sulfate hexahydrate by means of proton magnetic resonance experiments.
They report that the nearest neighbor p—p (proton-proton)
vectors are perpendicular to the a₁, a₂ and a₃ axes.† They argue that on
the basis of symmetry considerations all H atoms bonded to O's in a
† Previously, Spence and Muller¹⁷ had reported this to be so for the p—p
vectors of the water molecules, but had concluded that the p—p vectors of
the guanidinium groups could be parallel to the c-axis with a separation
of 2.05 Å.
single octahedron layer about a threefold axis must have the same
z-parameter. Of course, this is true only for those hydrogen atoms
bonded to N(I) atoms and to the water molecules about the threefold
axis at 0,0. The trigonal axes and planes of symmetry are such that only
three atoms about the axis at 1/3,2/3 and three about the axis at 2/3,1/3
must have the same value of z.

Thus, contrary to the statements of Aleksandrov et al., symmetry
conditions do not require all the nearest neighbor H—H vectors to be
parallel to the (00·1) plane, nor must they all be perpendicular to the
a₁, a₂ and a₃ axes. Only for those about the threefold axis where the
mirror planes intersect, namely at 0,0, must this be the case. However,
it is possible that the nearest neighbor H—H vectors about the threefold
axes at 1/3,2/3 and 2/3,1/3 are close to parallelism with the (00·1)
plane and perpendicularity to the a₁, a₂, a₃ axes.

Furthermore, Aleksandrov et al. refer to the trial structure reported
by Varfolomeeva et al. Although that structure is incorrect, this would
have no noticeable effect on the conclusions of Aleksandrov et al., since
they discuss only the nearest neighbor H—H vectors.

Thus, in calculating the H parameters, the tilting of the water H—H
bonds out of the (00·1) plane and skewness to the a₁, a₂, a₃ axes was
permitted in those water molecules about the threefold axes at 1/3,2/3
and 2/3,1/3. (The guanidinium ions, however, were assumed to be planar.)
In calculating the H positions, the water molecules were assumed to lie
in the planes connecting the water oxygen atom with the two sulfate oxygen
atoms involved in the hydrogen bonding. The bisector of the H—O—H
angle of 105° was taken as the line passing through the center of the
water oxygen atom and the center of the line connecting the two sulfate
oxygen atoms involved. The parameters of the N and O atoms
involved were those from cycle 4′b. The H-parameters thus deduced are
listed in Table IX. The new parameters caused some differences in the


TABLE IX — H PARAMETERS USED IN FINAL CYCLE

 Description                x         y         z
 on N(I) (atom 6)         0.205     0.086     0.51
 on N(II) (atom 1)        0.465     0.256     0.56
                          0.564     0.434     0.56
 on O(VIII) (atom 7)      0.139     0.218    −0.148
 on O(VII) (atom 8)      −0.072     0.134     0.156
 on O(IX) (atom 4)        0.457     0.257    −0.124
                          0.526     0.400    −0.111
 on O(X) (atom 5)        −0.452    −0.260     0.205
                          0.464     0.588     0.219
contributions to several amplitudes, but in general not very important
ones.

Some necessary adjustments of thermal parameters resulting from
cycle 4′b were made. In cycle 5′a,† only those positional parameters
were varied in which changes greater than σ/5 occurred between previous
cycles 2′ and 4′; all thermal parameters were varied in which there
were changes greater than σ/5 between cycles 1′ and 4′; all scale factors
were kept constant. In 5′b, only those parameters were varied in which
changes in 5′a were greater than σ/5. In 5′c, only positional parameters
were varied. In 5′d all parameters were varied. All results are listed in
Table IV. Differences range from very small to very large and are
indicative of the unattainability of convergence. We list also the σ's‡ in
the Δpᵢ's for the last cycle 5′d in the last column of Table IV. These
are especially large for most of the thermal parameters and for most of
the z-parameters, and reflect the strong interdependence in pairs of
parameters.

The correlation matrix‡ for cycle 5′d contains 6,670 ρᵢⱼ (i ≠ j) terms.
Thus we shall again only list the values of |ρᵢⱼ| ≥ 0.40 (Table X). Of
the 6,670 terms in the matrix, 176 have values greater than 0.40; 1,389
have values greater than 0.10.

On examining Table X, one finds that no interactions of scale factors
with positional parameters are listed. In fact, the correlation coefficients
for such combinations are all very low. However, there are all the other
types of interactions, namely: scale factor-thermal parameter, thermal
parameter-thermal parameter, positional parameter-positional parameter,
and several (those with asterisk) positional parameter-thermal
parameter. Most often, also, the interdependence is between analogous
parameters; for example, a z-parameter of an atom interacts with
z-parameters of other atoms. Even when a positional parameter interacts
with a thermal parameter, an analogy exists, e.g., a z-parameter
interacts with a β₃₃-parameter. This makes physical sense, of course,
and gives us some confidence that the correlation coefficients reflect the
structural interdependence of the parameters. Correlation may be
caused partially by the experimental technique§ but it is unlikely to
result mainly from the ill-conditioning of the normal equations by a

† It should be kept in mind that all cycles 5′ refer to the derivatives
evaluated with the parameters of cycle 5′a.

‡ It is worth emphasizing that statistical theory precludes the use of the
error estimates or normal equations matrix to determine the statistical
significance of the parameters listed. Only if convergence is actually
attained can these numbers be so used. Nevertheless, in a practical way,
the error estimates and correlation coefficients do give us important
information in the course of refinement or, as in the present case,
relative to the unattainability of convergence.

§ X-ray vs neutron diffraction.
TABLE X — CORRELATION COEFFICIENTS FROM CYCLE 5′d.†
(ONLY |ρᵢⱼ| ≥ 0.40 ARE LISTED)

   |ρ|        i—j₁,j₂,j₃,···
 0.40-0.50    3-4,5; 4-105; 6-75; 7-75; 8-75; 11-38,47,61,67,85; 15-56;
              18-19,34; 20-47*,61,97,103; 21-30; 24-30,86,88; 25-26,27;
              29-47,61; 30-33; 36-37; 38-61,103; 40-51,62; 42-68; 43-66*;
              44-46*; 45-46; 47-55,67,73,85,91,103; 51-64; 52-53;
              55-97,103; 57-59; 61-73,79,85,91; 63-69; 67-103; 75-81;
              77-79,83; 79-83*,97,103; 81-83; 97-108,114; 103-108,114;
              105-113; 108-110*; 113-116
 0.50-0.60    4-5,6,7,8; 5-113; 8-99,113; 11-97,103; 12-15; 13-15,58;
              16-59; 19-84; 20-91; 22-24; 26-89; 27-94*; 29-85,97,103;
              30-94; 37-53*; 38-47,97; 39-42; 41-50,63,69; 43-71; 47-97;
              48-51; 50-69; 51-62; 61-97,103; 67-97; 73-83*,97,103;
              85-91,97,103; 110-115
 0.60-0.70    5-105; 13-56; 14-57; 20-29; 21-24,88; 23-86; 27-92*; 28-90;
              30-92; 37-46; 40-42,49,68,70; 49-64; 52-65; 72-78,80*,82*;
              74-78*; 76-78*; 86-88; 96-104*,106*; 98-102*; 100-102*;
              108-115*; 111-116
 0.70-0.80    5-6,7,8; 8-105; 9-54; 18-27; 21-86; 32-93; 35-95; 36-66;
              45-60; 49-51,62; 50-63; 96-102; 108-114; 111-113*
 0.80-0.90    6-7,8; 11-55; 25-91; 38-67; 62-64; 73-79; 98-100
 0.90-1.00    7-8; 47-61; 56-58; 68-70; 74-76; 80-82; 92-94; 97-103;
              104-106

† See last column of Table IV for parameter numbers.
* Positional-thermal parameter correlation.


reasonable but not necessarily ideal weighting technique. It will be 
noticed also that the same pairs of parameters show very nearly the 
same measure of interdependence as indicated by earlier calculations, 
again corroborating the point that it is the structural model (including 
atomic form factors) which causes the interactions. 

For the sake of completeness, we show in Table XI a list of observed
amplitudes compared with those calculated from the parameters used
initially in cycle 5′ and including the contributions of the H atoms with
parameters shown in Table IX. Including consideration of multiplicity
and the differences when calculated amplitudes are greater than the
threshold values (with half the threshold value included in the
denominator) for reflections not observed, the discrepancy factor is 0.11.†

The over-all agreement is quite good despite several discrepancies in
which a calculated amplitude is above the threshold value for an
unobserved reflection.‡ Table XI attests to the validity of the conclusion
that the general features of the structure are correct.

† Six amplitudes, those of reflections 30·0, 11·1, 21·1, 22·1, 42·1 and
21·2, suffering from extinction, were excluded in calculation of this
discrepancy factor.

‡ These are a product of the instrument, which sometimes missed reflections
which, according to visual estimates of photographic intensities, it should
not have missed.
VI. FURTHER COMMENTS ON THE INDETERMINACY OF THE EXACT STRUCTURE
OF GUANIDINIUM ALUMINUM SULFATE HEXAHYDRATE


6.1 Importance of the Weighting Procedure 


The use of two very different weighting procedures did not break down 
the high correlations existing between parameters. It is doubtful, 
especially in the case of so large a number of parameters, that any 
reasonable weighting procedure would succeed in uncoupling the parame- 
ters sufficiently to lead to greater determinacy. 


6.2 Effect of Keeping Some of the Parameters Constant while Allowing 
Others to Vary 


In the case that there is correlation between parameters, it would seem 
that, at least in the final stages of the refinement, holding of such parame- 
ters constant could lead to erroneous results. In a case involving a smaller 
number of parameters it might be possible to obtain a confidence region® 
for all the parameters by holding some of the parameters constant, but 
at several different values. For example, suppose the problem involves n 
almost independent parameters and two almost completely dependent 
parameters which appear to prevent convergence. Choosing several 
judicious values of one of the latter and making the calculation for each 
one will give sets of values for the other parameters which will allow the 
construction of the equiprobability ellipsoids. 
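
The idea of holding one of two nearly dependent parameters at several chosen values and refitting the rest can be sketched with a toy linear model. Everything in the example is an assumption made for illustration: the model y = a·x + b·(x + 0.01x²), the synthetic "observations", and the function name are not taken from the paper.

```python
def fit_with_fixed(b_fixed, x, y):
    """Least-squares value of a (and residual sum of squares) with b held fixed,
    for the toy model y = a*x + b*(x + 0.01*x**2)."""
    d1 = x                                        # derivative with respect to a
    d2 = [xi + 0.01 * xi * xi for xi in x]        # derivative w.r.t. b, nearly equal to d1
    target = [yi - b_fixed * d2i for yi, d2i in zip(y, d2)]
    a = sum(d * t for d, t in zip(d1, target)) / sum(d * d for d in d1)
    ssq = sum((t - a * d) ** 2 for t, d in zip(target, d1))
    return a, ssq

x = [0.0, 0.5, 1.0, 1.5, 2.0]
# Synthetic observations generated with a = 1 and b = 1 (no noise).
y = [1.0 * xi + 1.0 * (xi + 0.01 * xi * xi) for xi in x]

# Hold b at several values and record the refitted a and the residual.
scan = [(b, *fit_with_fixed(b, x, y)) for b in (0.0, 0.5, 1.0, 1.5, 2.0)]
for b, a, ssq in scan:
    print(b, round(a, 4), ssq)
```

In this toy case the residual stays small over the whole range of b while a shifts to compensate, which is the elongated valley that a set of such calculations would map out.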

However, in a problem involving many parameters, and many large 
and multiple correlations, such a technique would appear to be im- 
practical. It should be mentioned that if the model were very nearly 
linear, only those correlations very near ±1 would be important in the
unattainability of convergence. However, it is possible that the more 
nonlinear the model, the more important the other correlations become. 


6.3 Possible Effects of Increasing the Number of Observed Data 


There are two ways in which the number of data might be increased. 
One is to obtain more of the weak intensities by increasing the detector 
sensitivity. It does not seem that this would have the effect of decreas- 
ing the correlations. This was shown to some extent by the calculations 
based on the two different weighting schemes. In the first case the 
weighted evaluated derivatives for unobserved reflections were included; 
in the second, these were given zero weight and therefore excluded. 
Also, the exclusion of reflections for which sin²θ/λ² < 0.0800 did not
have an apparently significant effect on the correlations. (Compare, for 
example, analogous values in Tables VI and X.) 
The other way in which to increase the number of data is to use shorter
wavelength radiation. Now, it is not necessary actually to measure these
data before determining the effect on the correlations because the
correlation coefficients, as calculated, depend only on the model and the
evaluated derivatives. It is unlikely that the situation would change very
much if the additional terms were included because the relationship of
the derivatives with respect to correlated parameters would probably
not change very much.

In the case of tetragonal BaTiO₃, higher index reflections would
have almost no important contributions from the oxygen atoms. Thus
the interactions among oxygen ion parameters will not be affected.
Similarly, interactions among the metal ion parameters will probably
not be much affected. But interactions between the two groups could be
reduced. However, in the case of an all light atom structure, it would
appear that the extra data would probably not reduce the correlations.


6.4 Possible Effect of Greater Accuracy in Measurement of Observed Intensities


The effect of greater accuracy in measurement of the observed in-
tensities is not really predictable in this case. To be sure, in each iteration
the reduction of $s = [\sum(\sqrt{w}\,\Delta)^2]^{1/2}/\sqrt{m - n}$ would reduce the apparent
size of the equiprobability surfaces. This we certainly know.

However, we must ask first whether there is a limit to the accuracy of 
the observed amplitude. One would suspect that there is such a limit. 
Furthermore as pointed out by Caticha-Ellis and Rimsky,” there will 
always be a discrepancy between the calculated and true values of the 
amplitudes. Thus s has a lower positive limit. 

Reduction of $s$ would not only decrease the size of the equiprobability
surfaces (and therefore, of course, the standard estimates of error) but it
would also decrease the components of the vector $v$, $v_j = \sum(\sqrt{w}\,D_j)(\sqrt{w}\,\Delta)$,
where the $D_j$ are the evaluated derivatives. Thus, for example,
if cycle 5′d were repeated with each $\Delta$ decreased to ½ of its value, each $v_j$,
and therefore each $\Delta p_i = \sum_j b_{ij}v_j$, would be reduced to the same extent.
Of course an average reduction of ½ might not do the same thing. In fact,
with a poor distribution of the reduction in $\Delta$, the $\Delta p_i$ in some cases
could even be larger, depending on the algebraic values of the $D_j$.
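
The scaling argument of this paragraph can be checked on a toy problem. The following is only a sketch and not the Busing-Levy program: the function name `parameter_shifts`, the one-parameter data, and the unit matrix `B` standing in for the matrix of the $b_{ij}$ are illustrative assumptions.

```python
def parameter_shifts(B, D, w, delta):
    """Return the shifts dp_i = sum_j b_ij * v_j, where
    v_j = sum_k (sqrt(w_k) D_kj)(sqrt(w_k) delta_k) = sum_k w_k D_kj delta_k.

    B     : n x n matrix of the b_ij (hypothetical values here)
    D     : m x n matrix of evaluated derivatives, one row per observation
    w     : m weights
    delta : m discrepancies between observed and calculated amplitudes
    """
    n = len(B)
    v = [sum(w[k] * D[k][j] * delta[k] for k in range(len(delta)))
         for j in range(n)]
    return [sum(B[i][j] * v[j] for j in range(n)) for i in range(n)]


# Toy problem (assumed data): one parameter, two observations whose
# discrepancies nearly cancel in v.
B = [[1.0]]
D = [[1.0], [1.0]]
w = [1.0, 1.0]
delta = [1.0, -0.9]

full = parameter_shifts(B, D, w, delta)[0]
# Every discrepancy halved: the shift is halved as well.
halved = parameter_shifts(B, D, w, [0.5 * d for d in delta])[0]
# Reduction factors 0.9 and 0.1 (average 0.5), unevenly distributed.
uneven = parameter_shifts(B, D, w, [0.9 * delta[0], 0.1 * delta[1]])[0]

print(full, halved, uneven)
```

In this toy case the uniform reduction halves the shift, while the uneven reduction with the same average gives a shift larger than the original one, because the near cancellation between the two discrepancies is lost.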

Actually the nature of the shape of the equiprobability surfaces might
give the best clue to what might happen if increased accuracy of measure-
ment were attainable. The nonlinearity of the model would probably
play an important part. The more nonlinear, the more important are apt
to be those correlations which are not perfect. Of course, even one perfect
correlation, ±1, renders the whole problem indeterminate if insistence
is made on allowing all parameters to vary in an iteration. This is not
necessary, however, and one could learn a great deal about the parameters
of a structure which has only one perfect correlation and the rest
very small ones (see Section 6.2). In the present case, there are many 
correlations having absolute values between 0.90 and 1.00 (Table X). 
These have the specific values: 0.917, 0.905, 0.918, 0.907, 0.975, 0.963, 
0.901, 0.979, and 0.902, respectively. Perhaps the most important ones 
are the three closest to unity. 

In the case of gross nonlinearity it seems possible that these and so 
many of the other high correlations of Table X could cause unattain- 
ability of convergence even if the lowest limit of s were attained. That is, 
the shape of the equiprobability surface may be such as to prevent the 
practical attainment of separate estimates of the parameters (see also 
Ref. 21) from the given data. This seems to be true of the BaTiO3
case.

Needless to say, a measure of doubt remains. Further work might aid 
in removing this doubt. This would involve trying to obtain more data 
and of greater accuracy, and further calculations. Our doing this is not 
presently contemplated.


6.5 Fourier Synthesis vs Least Squares


In the case of tetragonal barium titanate, Fourier synthesis produced
no improvement on the least squares method. It is likely that with the
present data, the situation in the case of the G.A.S.H. would be the
same. On the other hand, there is no requirement of linearity in the
Fourier synthesis: the actual amplitudes are the Fourier coefficients.
In the least squares technique, an approximation is used: i.e.,

$$F_{hkl}(p_1, p_2, \cdots, p_n) = F_{hkl}(\bar{p}_1 + \Delta p_1, \bar{p}_2 + \Delta p_2, \cdots, \bar{p}_n + \Delta p_n)$$

$$= F_{hkl}(\bar{p}_1, \bar{p}_2, \cdots, \bar{p}_n) + \sum_{j=1}^{n} \left.\frac{\partial F_{hkl}}{\partial p_j}\right|_{\bar{p}_j} \Delta p_j$$

where $\bar{p}_1, \bar{p}_2, \cdots, \bar{p}_n$ are approximate but nearly true values of the
parameters. It is possible that higher order terms could be important
here, but it is not clear that inclusion of the next higher order terms would
necessarily lead to improvement. Also, the calculation would increase in
complexity.

Cochran has shown that a rather close relationship exists between the
Fourier synthesis and least squares techniques. There are conditions on 


450 THE BELL SYSTEM TECHNICAL JOURNAL, MARCH 1962 


this relationship given by Cochran and Hoard and Geller, and in
addition in the actual least squares calculation, an approximation is 
made and nearness to linearity is assumed. Therefore, if the nonlinearity 
is not serious, convergence should be attainable in either case. If it is 
serious, the relationship could break down further and the Fourier 
synthesis could conceivably converge when the least squares calculation 
tends not to converge. 


VII. COMMENTS ON THE SINGLE-CRYSTAL AUTOMATIC DIFFRACTOMETER 


As mentioned earlier, the data used in this work were collected four 
years ago. Since that time only one or two attempts were made to use 
the instrument for other studies. These were unsuccessful because of 
difficulties which are probably surmountable, but require modification 
of the instrument. 

The present instrument puts a lower limit on the sample size. To keep 
the time for recording a layer within reasonable bounds and to prevent 
the instrument from reacting to background scattering, only intensities 
above a certain preset count energize the circuitry which sets the crystal 
back and shifts speed. To obtain satisfactory counting rates the use of 
large crystals is required. (The intensity is proportional to the number 
of unit cells irradiated.) However, to obtain adequate or meaningful 
intensities from highly absorbing materials one must have small crystals. 
In short, the instrument presently is suited mainly to crystals with low 
absorption and from which sizable cylindrical specimens can be made. 

The indexing of the reflections was a tedious process. The possibility 
of error, particularly at the high angles, was great, but the use of photo- 
graphs and cross examination of data helped prevent errors. An improve- 
ment on the Bond-Benedict automatic single-crystal diffractometer 
would be provision for foolproof pre-indexing of the reflections. 


VIII. SUMMARY 


Extensive application of the least squares refinement technique 
(through the use of the Busing-Levy IBM 704 program) to three- 
dimensional X-ray data from crystals of guanidinium aluminum sulfate 
hexahydrate indicated that although the structure as originally reported 
for the isostructural guanidinium gallium sulfate is essentially correct, 
an exact structure is unattainable from the present data by means of the 
least squares method of refinement. The numerous high correlations of 
pairs of parameters, apparently linked with the nature of the structure, 
appear to be a primary cause of prevention of convergence.

The course of the calculations has been outlined with special emphasis
on some of the more obvious parameter interactions, but tables are given 
to enable the more interested reader to examine the results in somewhat 
greater detail. 

The work also further demonstrates the importance of the correlation
matrix as a tool for establishing the existence or nonexistence of inter-
dependence of structural parameters.


Discrimination Against Unwanted Orders
in the Fabry-Perot Resonator


By D. A. KLEINMAN and P. P. KISLIUK 
(Manuscript received September 20, 1961) 


It is proposed here that the usual Fabry-Perot interferometer structure of
the optical maser may be modified in a very simple way to provide discrimi-
nation against unwanted orders. The modification is an extra reflecting sur-
face suitably positioned outside the maser which can greatly affect the losses
of the various orders. A simple one-dimensional analysis is given for the
effect, and numerical results are presented for a realistic case, showing that
the effect can be large. It is concluded that this technique may be useful in
preventing unwanted oscillations in the optical maser.


I. INTRODUCTION 


The Fabry-Perot interferometer has recently become important as a 
resonant cavity for electromagnetic radiation at optical frequencies.'”"*"* 
The nature of the modes of such a cavity has been discussed by Schawlow 
and Townes’ and by Fox and Li.’ The modes may be specified by three 
quantum numbers, one of which is the familiar order number giving the 
separation of the plates in units of the half-wavelength. The other two 
quantum numbers specify the possible field configurations across the 
plates, which are essentially identical in each order. Fox and Li have 
investigated these configurations and the corresponding frequencies and 
losses for interferometers consisting of perfectly reflecting plates in air. In 
the usual laboratory interferometer the Fox and Li modes cannot be 
resolved because of insufficient reflectivity of the plates. Therefore the 
role played by these modes in optical masers is not settled. On the other 
hand, fine structure which could be due to various Fabry-Perot orders 
has been seen in the output of both the gas° and the ruby’ optical maser. 
It has been pointed out’® that the optical maser is inherently a multi- 
mode device, and that the excitation of many modes can lead to unde- 
sirable effects in the noise, stability, and ultimate usefulness of the de- 
vice. Therefore it is proposed here that it would be useful to discriminate 
against many of the Fabry-Perot orders which can occur in the output
by increasing their losses relative to other “preferred” modes.

The Fabry-Perot orders present a problem only when the fluorescence 
emission of the maser covers a frequency band wider than $(2\mu d)^{-1}$ wave
numbers, where $\mu$ is the refractive index and $d$ the separation of the
plates. This is the case in the gas maser of Javan, Bennett, and Herriott* 
where $(2\mu d)^{-1} \approx 0.005$ cm$^{-1}$ and the Doppler-broadened Ne transition
would be expected to have a width $\sim 0.05$ cm$^{-1}$. Also, in the ruby optical maser
of Collins et al., $(2\mu d)^{-1} \approx 0.1$ cm$^{-1}$ while the fluorescence line width at room
temperature is $\sim 10$ cm$^{-1}$. The orders cannot be eliminated in these cases by
shortening the maser and hence spreading the orders, because the gain 
would then be insufficient to produce oscillations.’ In ruby, however, the 
gain could be increased’ by more than an order of magnitude by cooling, 
so that the crystal could be shortened. At the same time, the cooling 
could decrease the line width by more than an order of magnitude," so 
that elimination of orders appears possible in ruby optical masers by cool- 
ing. These examples show the interrelation of gain, line width, and the oc- 
currence of Fabry-Perot orders in the optical maser output. 

The idea of using a Fabry-Perot interferometer to discriminate against 
unwanted orders in the optical maser has occurred to a number of 
people.” Indeed, if the external beam contains several orders, a Fabry- 
Perot etalon could be constructed which would transmit only one of 
them. This, of course, would not necessarily have any effect on the losses 
of the various modes in the maser. If the etalon were put in the internal 
beam, elementary considerations do not tell us what to expect for the 
relative losses of the modes. The structure to be proposed in the next 
section is equivalent to making the etalon one of the reflecting ends of 
the maser. It is believed that a detailed discussion of how discrimination 
comes about in such structures is given here for the first time. 


II. A MODIFIED INTERFEROMETER


It is proposed that another reflecting plate parallel to the maser plates 
be provided outside the maser with a means for adjusting the separation 
of the new plate from the maser. This would produce a modified inter- 
ferometer having three essential optical surfaces with the active medium 
in the space between two of these surfaces. It is expected that the separa- 
tion of the third surface from the maser will be much less than the length 
of the maser. The purpose of the extra surface is to provide discrimina- 
tion between the Fabry-Perot orders of the original maser by making 
some orders very lossy compared to other orders. The losses may be due 
to scattering by inhomogeneities in the medium and irregularities on the
reflecting surfaces, absorption by processes other than the fluorescence
process of the active medium, and transmission through the outer reflec- 
ting surfaces. For convenience of discussion, all losses may be ascribed to 
the last mechanism by assigning suitable effective reflectivities to the 
outer surfaces. In any case, it is clear that the proposal has meaning only 
when losses are taken into account, since the only other effect of the extra 
surface would be to shift the frequencies of the already existing orders 
by amounts less than $(2\mu d)^{-1}$ and to introduce new frequencies corre-
sponding to the increased over-all length of the modified interferometer.
Therefore the performance of the device cannot be deduced in an ele- 
mentary way by considering the two regions between the surfaces as two 
interferometers with the shorter preferentially selecting and rejecting 
certain orders of the longer. The truth is that the modified interferometer 
has more, not fewer, orders than the original maser, but unlike the latter 
the orders may have very different losses. 


III. ANALYSIS


Fig. 1 — Schematic diagram of one-dimensional symmetric structure chosen
for analysis of modified interferometer. The value of the constant A is not needed
in the analysis.

For analysis it is convenient to consider the one-dimensional problem
shown schematically in Fig. 1. A medium of real dielectric constant
$\epsilon > 1$ and real conductivity $\sigma$ occupies the region $-a \le z \le a$. For
$|z| > a$ it is assumed that $\epsilon = 1$ and $\sigma = 0$. At $z = \pm b$ are placed
reflecting surfaces having the reflectivity for amplitude

$$r_b = e^{-2\beta}. \tag{1}$$

Since the phase angle of reflection is unimportant here, it has been
assumed zero. For later use the quantity

$$\Gamma = \tanh\beta \tag{2}$$

will now be defined. The reflectivity of the surfaces at $z = \pm a$ is

$$r_a = (\sqrt{\epsilon} - 1)/(\sqrt{\epsilon} + 1). \tag{3}$$

From (1), (2) and (3) one can write

$$\Gamma = (1 - r_b)/(1 + r_b), \qquad 1/\sqrt{\epsilon} = (1 - r_a)/(1 + r_a). \tag{4}$$

It is therefore possible in this example to consider arbitrary reflectivities
at $z = \pm a, \pm b$ by suitable choices for $\Gamma$ and $1/\sqrt{\epsilon}$ in the range 0 to 1.
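
Relations (1) to (4) are easy to evaluate numerically. The sketch below is illustrative only; the function names are assumptions introduced here, and the sample values are those used later in the numerical example of Section IV.

```python
import math


def reflectivities(gamma, inv_sqrt_eps):
    """Invert relations (4): amplitude reflectivities (r_a, r_b) for given
    Gamma and 1/sqrt(eps), both taken in the range 0 to 1."""
    r_b = (1.0 - gamma) / (1.0 + gamma)
    r_a = (1.0 - inv_sqrt_eps) / (1.0 + inv_sqrt_eps)
    return r_a, r_b


def gamma_from_rb(r_b):
    """Relations (1) and (2): r_b = exp(-2 beta), Gamma = tanh(beta)."""
    beta = -0.5 * math.log(r_b)
    return math.tanh(beta)


# Sample values: Gamma = 0.02 and sqrt(eps) = 10.
r_a, r_b = reflectivities(0.02, 0.1)
print(round(r_a, 2), round(r_b, 2))
print(gamma_from_rb(r_b))
```

The second function simply confirms that (1) and (2) together reproduce the first relation of (4).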

The symmetry of Fig. 1 about a plane at $z = 0$ causes the field to be
either even or odd with respect to reflection in this plane. The even solu-
tions are shown by (+) and the odd solutions by (−) signs in Fig. 1.
The propagation constants are given by

$$k_0 = \omega/c,$$

$$k = k_0\sqrt{\epsilon}\,[1 + i(4\pi\sigma/\epsilon\omega)]^{1/2} = k_0\sqrt{\epsilon} + i(2\pi\sigma/c\sqrt{\epsilon}) + \cdots. \tag{5}$$

The continuity of the field and its derivative at $z = a$ gives the conditions

$$k\tan(ka) = -k_0\tan(k_0 b - k_0 a + i\beta) \tag{6}$$

for even modes, and

$$k_0\tan(ka) = +k\cot(k_0 b - k_0 a + i\beta) \tag{7}$$

for odd modes. These equations give, in general, complex eigenvalues for
the angular frequency $\omega$.

It is convenient to require that $\omega$ be real and allow $\sigma$ to assume an ap-
propriate negative value. Both $\omega$ and $\sigma$ are determined by (6) or (7) for
even or odd modes respectively. Physically this corresponds to supplying
sufficient gain through the negative $\sigma$ to maintain steady oscillations at
frequency $\omega$. The larger the value of $-\sigma$ the greater are the losses of the
mode in question. Now let the dimensions be so chosen that

$$n(b - a) = ma\sqrt{\epsilon} \tag{8}$$

where $m$, $n$ are positive integers. It is then possible to write

$$k_0(b - a) = m\pi + (m/n)\Delta, \qquad \sqrt{\epsilon}\,k_0 a = n\pi + \Delta. \tag{9}$$

The conductivity to be determined is contained in the parameter

$$\chi = \tanh(2\pi\sigma a/c\sqrt{\epsilon}). \tag{10}$$

In any practical case the ratio $k/k_0$ occurring in (6) and (7) can be con-
sidered real, $k/k_0 \approx \sqrt{\epsilon}$. The equations for the real frequency and con-
ductivity then reduce to

$$\tan\Delta = -\left(\tan\frac{m}{n}\Delta\right)\frac{1 + \sqrt{\epsilon}\,\Gamma\chi}{\sqrt{\epsilon} + \Gamma\chi} \tag{11}$$

$$\chi = -\Gamma\,\frac{1 - \sqrt{\epsilon}\tan\Delta\tan(m/n)\Delta}{\sqrt{\epsilon} - \tan\Delta\tan(m/n)\Delta} \tag{12}$$

for even modes, and to

$$\tan\Delta = \left(\cot\frac{m}{n}\Delta\right)\frac{\sqrt{\epsilon} + \chi\Gamma}{1 + \sqrt{\epsilon}\,\chi\Gamma} \tag{13}$$

$$\chi = -\Gamma\,\frac{\sqrt{\epsilon}\tan(m/n)\Delta + \tan\Delta}{\tan(m/n)\Delta + \sqrt{\epsilon}\tan\Delta} \tag{14}$$


for odd modes. When $\tan\Delta$ is eliminated between (11) and (12) or be-
tween (13) and (14) the same quadratic equation for $\chi$ is obtained,
namely

$$\chi^2 + 2p\chi + 1 = 0 \tag{15}$$

where

$$p = \frac{\epsilon + \Gamma^2 + (1 + \epsilon\Gamma^2)\tan^2(m/n)\Delta}{2\Gamma\sqrt{\epsilon}\,[1 + \tan^2(m/n)\Delta]}. \tag{16}$$

The solution of (15) which reduces properly as $r_b \to 0$ ($\Gamma \to 1$) is

$$\chi = -p + |(p^2 - 1)^{1/2}|. \tag{17}$$

When $|\chi| \ll 1$, this reduces to

$$\chi \approx -\frac{\Gamma\sqrt{\epsilon}\,[1 + \tan^2(m/n)\Delta]}{\epsilon + \tan^2(m/n)\Delta}. \tag{18}$$

The most practical method of solution is to find the frequencies by neglect-
ing $\Gamma\chi$ in (11) and (13), which gives

$$\sqrt{\epsilon}\tan\Delta = -\tan(m/n)\Delta \quad \text{(even)} \tag{19}$$

$$\tan\Delta = \sqrt{\epsilon}\cot(m/n)\Delta \quad \text{(odd)} \tag{20}$$

respectively. From these solutions, the values of $\tan(m/n)\Delta$ can be sub-
stituted into (17) or (18) to obtain $\chi$.

From (15) and (16) it is seen that $\chi$ depends upon $\Delta$ through
$\tan^2(m/n)\Delta$. As a result of the “tuning” condition (8), $\Delta = 0$ is a solu-
tion of (19); this is the “preferred” mode having the lowest loss

$$\chi_{\min} = -\Gamma/\sqrt{\epsilon}. \tag{21}$$

The largest losses belong to modes having $\tan^2(m/n)\Delta \gg 1$. The solution
(17) gives two results in the limit $\tan^2(m/n)\Delta \to \infty$, depending on
whether $\epsilon\Gamma^2 < 1$ or $> 1$:

$$\chi_{\max} = -\Gamma\sqrt{\epsilon} \quad (\epsilon\Gamma^2 < 1), \qquad \chi_{\max} = -1/(\Gamma\sqrt{\epsilon}) \quad (\epsilon\Gamma^2 > 1). \tag{22}$$

Let the quantity

$$R = \chi/\chi_{\min} \tag{23}$$

be called the discrimination ratio; then $R_{\max} = \epsilon$ or $1/\Gamma^2$, whichever is
smaller. Therefore the extra reflecting surface should satisfy

$$r_b \ge r_a \tag{24}$$

to achieve the maximum discrimination, but there is no advantage in
making $r_b$ exceed $r_a$. It should be noted that the practical approximations
(19) and (20) are not valid if $\epsilon\Gamma^2 > 1$.
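
The loss parameter defined by (15) to (18) and its limits (21) and (22) can be evaluated directly. The following is a minimal sketch under stated assumptions: the function names are introduced here for illustration, `tan2` stands for $\tan^2(m/n)\Delta$, the limit $\tan^2(m/n)\Delta \to \infty$ is imitated by a large finite value, and the sample parameters are those of the example in Section IV.

```python
import math


def p_coefficient(tan2, eps, gamma):
    """Coefficient p of (16); tan2 stands for tan^2((m/n) Delta)."""
    num = eps + gamma ** 2 + (1.0 + eps * gamma ** 2) * tan2
    return num / (2.0 * gamma * math.sqrt(eps) * (1.0 + tan2))


def chi_exact(tan2, eps, gamma):
    """Root (17) of chi^2 + 2 p chi + 1 = 0, the one with |chi| <= 1."""
    p = p_coefficient(tan2, eps, gamma)
    # -p + sqrt(p^2 - 1), rewritten as -1/(p + sqrt(p^2 - 1)) so that two
    # nearly equal numbers are never subtracted.
    return -1.0 / (p + math.sqrt(max(p * p - 1.0, 0.0)))


def chi_approx(tan2, eps, gamma):
    """Small-|chi| form (18)."""
    return -gamma * math.sqrt(eps) * (1.0 + tan2) / (eps + tan2)


eps, gamma = 100.0, 0.02                 # sqrt(eps) = 10, Gamma = 0.02
chi_min = chi_exact(0.0, eps, gamma)     # preferred mode, compare (21)
chi_max = chi_exact(1e12, eps, gamma)    # large tan^2, compare (22)
print(chi_min, chi_max, chi_max / chi_min)
```

For these parameters $\epsilon\Gamma^2 = 0.04 < 1$, so the ratio printed last should be close to $R_{\max} = \epsilon$.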


IV. DISCUSSION 


The properties of the solutions are best discussed with the aid of an
example. For simplicity, a case is chosen in which (19), (20) are valid.
Let

$$m/n = 1/5, \qquad \sqrt{\epsilon} = 10, \qquad \Gamma = 0.02. \tag{25}$$

The corresponding reflectivities for amplitude are $r_a = 0.82$, $r_b = 0.96$.
According to (21) the loss of the preferred mode $\Delta = 0$ is measured by

$$\chi(0) = \chi_{\min} = -0.002. \tag{26}$$

From (19) it is seen that $\Delta = 5\pi/2$ is a solution with $\tan^2(m/n)\Delta = \infty$
so that according to (22)

$$\chi(5\pi/2) = \chi_{\max} = -0.2. \tag{27}$$

Fig. 2 — Graphical representations of (19) and (20) for m/n = 1/5, √ε = 10.
Odd solutions are indicated by squares and even solutions by circles except at
Δ = 5π/2, where the intersection is at ±∞.

TABLE I — SUMMARY OF RESULTS FOR NUMERICAL EXAMPLE WITH m/n = 1/5, √ε = 10

    Δ          type    tan Δ      tan Δ/5      R
    0          even     0          0           1
    88°10′     odd     +31.46     +0.318       1.1
    175°58′    even    -0.0705    +0.705       1.49
    262°34′    odd     +7.67      +1.304       2.66
    345°21′    even    -0.261     +2.61        7.27
    412°38′    odd     +1.31      +7.63       37.4
    450°       even     ∞          ∞         100

The graphical solution of (19) and (20) is sketched in Fig. 2 with circles
representing even solutions and squares odd solutions. The results are
summarized in Table I up to $\Delta = 5\pi/2 = 450^\circ$; the remaining roots in
the fundamental period of $5\pi$ may be obtained from the symmetry about
$\Delta = 5\pi/2$. The roots are alternately even and odd as shown in the second
column, and $\tan(m/n)\Delta$ in the fourth column rises monotonically from
0 to $\infty$ corresponding to increasing losses. The discrimination ratio $R$,
given in the fifth column, increases from 1 to 100. These results are fur-
ther summarized in Fig. 3, where the spectrum just calculated is com-
pared with that of the “original” interferometer having no surfaces at
$z = \pm b$. The loss in the original interferometer is $\chi = -1/\sqrt{\epsilon} = -0.1$
for all modes. The heights of the spectrum lines in Fig. 3 are proportional
to $1/R$ to indicate the relative “Q” of each mode. The total number of
frequencies in the fundamental period is twelve compared with ten in
the original interferometer for the same period. This is exactly what one
would expect, corresponding to the 20 per cent increase in optical length
of the modified interferometer. Also as one would expect, the spacing of
the preferred modes corresponds to the orders of an ordinary Fabry-
Perot interferometer of spacing $d = b - a$.
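
The roots listed in Table I can be recomputed by solving (19) and (20) numerically. The sketch below is an illustration, not the graphical method of Fig. 2: it uses plain bisection, the bracketing intervals are read off from the poles of $\tan\Delta$, the function names are assumptions, and $R$ is evaluated from the approximation (18) divided by (21).

```python
import math

SQRT_EPS = 10.0      # sqrt(eps) of example (25)
RATIO = 1.0 / 5.0    # m/n of example (25)


def even_residual(delta):
    """Equation (19) as a residual: sqrt(eps) tan(D) + tan((m/n) D)."""
    return SQRT_EPS * math.tan(delta) + math.tan(RATIO * delta)


def odd_residual(delta):
    """Equation (20) as a residual: tan(D) - sqrt(eps) cot((m/n) D)."""
    return math.tan(delta) - SQRT_EPS / math.tan(RATIO * delta)


def bisect(f, lo, hi, steps=200):
    """Plain bisection; assumes f(lo) < 0 < f(hi) with one root between."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if f(mid) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def discrimination_ratio(delta):
    """R = chi/chi_min from the approximation (18) together with (21)."""
    t2 = math.tan(RATIO * delta) ** 2
    eps = SQRT_EPS ** 2
    return eps * (1.0 + t2) / (eps + t2)


PAD = math.radians(1e-6)   # keeps each bracket just clear of a pole
# Brackets in degrees; each residual rises from negative to positive inside.
brackets = [("odd", 0, 90), ("even", 90, 180), ("odd", 180, 270),
            ("even", 270, 360), ("odd", 360, 450)]

rows = []
for kind, lo_deg, hi_deg in brackets:
    f = even_residual if kind == "even" else odd_residual
    root = bisect(f, math.radians(lo_deg) + PAD, math.radians(hi_deg) - PAD)
    rows.append((kind, math.degrees(root), discrimination_ratio(root)))

for kind, deg, ratio in rows:
    print(f"{deg:8.2f} deg  {kind:4s}  R = {ratio:6.2f}")
```

The five roots found this way are the entries of Table I between $\Delta = 0$ and $\Delta = 450^\circ$; the end points themselves follow from (19) by inspection.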

Fig. 3 — The calculated spectrum with the “Q” of each mode indicated by
the height of the lines. Shown below for comparison is the spectrum of the “origi-
nal” interferometer.

It will be seen in Fig. 3 that the three modes on either side of a pre-
ferred mode have frequencies very close to modes of the original inter-
ferometer at $\Delta = \pm\pi/2, \pm\pi, \pm 3\pi/2$. The losses of these modes can be
calculated to a good approximation from these values of $\Delta$. In general
the approximation is

$$\tan(m/n)\Delta \approx \tan[(m/n)N\pi/2] \tag{28}$$

where $N = 0, 1, 2, \cdots$ but $N < n/m$. Using (28) the evaluation of (17)
or (18) can then be carried out immediately without solving for all of
the frequencies. This is very convenient since only the modes near the
preferred mode are expected to be of interest. It will be noted that the
extra modes introduced by the extra surface are among the lossy modes.

The periodicity in the above example is a result of choosing $n/m$ an
integer. If $n/m$ is chosen not an integer, the periodicity is destroyed, but
$\Delta = 0$ remains a preferred mode with minimum loss. Except for extra
modes in the regions of high loss, the general effect of the extra surface is
to impose a modulation of period $(n/m)\pi$ on the original modes. It is of
course not essential for the desired effect that this modulation have a
period commensurate with the period of the orders of the original inter-
ferometer. Greatest advantage in discrimination against unwanted
Fabry-Perot orders is obtained by setting

$$b - a \approx (2\Delta\nu)^{-1} \tag{29}$$

where $\Delta\nu$ is the half-width at half-maximum of the fluorescence emission.
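
The shortcut (28) can be compared with the tabulated values for the numerical example. The sketch below is illustrative; the function name is an assumption, and $R$ is again taken from (18) divided by (21).

```python
import math


def ratio_from_approx(N, m_over_n, eps):
    """R from (18) and (21), with tan((m/n) Delta) replaced by (28)."""
    t2 = math.tan(m_over_n * N * math.pi / 2.0) ** 2
    return eps * (1.0 + t2) / (eps + t2)


# Example (25): m/n = 1/5, eps = 100; N = 0 is the preferred mode.
approx = [ratio_from_approx(N, 1.0 / 5.0, 100.0) for N in range(4)]
for N, ratio in enumerate(approx):
    print(N, round(ratio, 2))
```

For $N = 1, 2, 3$ these values may be set against the entries 1.1, 1.49 and 2.66 of Table I, which were obtained from the actual roots of (19) and (20).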


V. SUMMARY 


The theory of the orders of the modified interferometer has been 
treated in one dimension by considering the symmetrical structure of Fig. 1. 
The analysis clearly shows the nature and magnitude of the effects to be 
expected. These effects do not depend in any essential way upon the 
symmetry assumed for convenience in the analysis, and similar results 
would be expected for an unsymmetrical modified interferometer with 
only one extra reflecting surface. It is clear that details in the analysis 
could be generalized in various ways without changing the substance of 
the conclusions. The most important of these would be to allow arbitrary 
reflection and absorption at the interfaces at $\pm a$ to represent the prop-
erties of deposited metal layers. On the basis of what has been presented,
however, it can be asserted that a third surface of suitable reflectivity 
and properly positioned can provide considerable discrimination between 
the orders of a Fabry-Perot interferometer.


The One-Sided Barrier Problem
for Gaussian Noise


By DAVID SLEPIAN 
(Manuscript received September 21, 1961) 


This paper is concerned with the probability, $P[T, r(\tau)]$, that a stationary
Gaussian process with mean zero and covariance function $r(\tau)$ be nonnega-
tive throughout a given interval of duration $T$. Several strict upper and lower
bounds for $P$ are given, along with some comparison theorems that relate
$P$'s for different covariance functions. Similar results are given for
$F[T, r(\tau)]$, the probability distribution for the interval between two successive
zeros of the process.


INTRODUCTION 


Let X(t) be a real continuous parameter Gaussian process, stationary 
and continuous in the mean. We shall assume throughout that 
$EX(t) = 0$ and shall write $r(\tau) = EX(t)X(t + \tau)$. We further assume
throughout that we are dealing with a separable, measurable version of 
the process. 

Our main concern in this paper is the probability $P[T, r(\tau)]$ that $X(t)$
be nonnegative for $0 \le t \le T$. This quantity is of interest as a means of
describing the duration of the excursions taken by the process from its
mean. From $P[T, r(\tau)]$, the distribution function $F[\lambda, r(\tau)]$ of the inter-
val between successive zeros of the process can be determined by differ-
entiation [see (19)]. This latter quantity is of importance in a variety of
engineering applications of noise theory.

Considerable effort has been directed in the past toward the numerical
determination of $F[\lambda, r(\tau)]$ both theoretically and empirically.
These researches have resulted in various approximations for $F[\lambda, r(\tau)]$,
but many of these are neither upper nor lower bounds for $F$, and exact
circumstances under which they are good approximations are not clear.
Generally speaking, they are good for small values of $\lambda$ and become nuga-
tory for sufficiently large $\lambda$. There appears to be nothing rigorous in the
literature concerning the asymptotic behavior of $F$ for large $\lambda$. (An ap-
proximation method is given in Ref. 21.)

In this paper we summarize some known results and present a number
of new strict bounds and comparison theorems for $P[T, r(\tau)]$ and
$F[\lambda, r(\tau)]$. The most important of these are: Theorem 1, Section 1.3;
Theorem 3, Section 1.4; and Theorem 10, Section 1.8. Theorems 12 and
13 (Section 2.7) dealing with class 2 covariances (defined in Section 1.1),
though of less importance for our goal, are perhaps of more than passing
interest. These and other results presented shed some light on theoretical
questions regarding $P$ and $F$. Their utility in numerically determining
these quantities will be discussed elsewhere.

The paper is divided into two parts: Part I presents definitions, results, 
and discussions; Part II contains the more technical aspects of proofs 
and other supportive material for Part I. 


PART I — DEFINITIONS, RESULTS AND DISCUSSIONS 


1.1 Preliminaries 


From its definition, it is clear that $P[T, r(\tau)]$ is a nonincreasing function
of $T$. It assumes the value $\tfrac{1}{2}$ for $T = 0$. It obeys the scaling laws

$$P[T, \lambda r(\tau)] = P[T, r(\tau)] \tag{1}$$

$$P[T, r(\lambda\tau)] = P[\lambda T, r(\tau)] \tag{2}$$

$$\lambda > 0.$$

In asserting (2) for all $\lambda > 0$ we have assumed $r(\tau)$ given for all $\tau$.
This is a convention that will be adhered to throughout this paper. It is
to be noted, however, that $P[t, r(\tau)]$ for $0 \le t \le T$, depends only on the
“piece” of the covariance function $r(\tau)$, $0 \le \tau \le T$.

The scaling law (1) suggests normalizing the covariances to be con- 
sidered so that 


r(0) = 1. (3) 


We adopt this convention hereafter. 

The scaling law (2) suggests that a normalization of the time scale is 
in order. There does not appear to be a convenient way to do this for the 
class of all covariances. For processes continuous in the mean, such as 
are being considered here, all one can say in general about covariances is 
that they are even continuous nonnegative-definite functions. This is a 
rather large class of functions containing a great variety of pathologies
such as nowhere differentiable continuous functions. In what follows we
shall have occasion to consider covariances $r(\tau)$, strictly monotone in
some right neighborhood of $\tau = 0$ and such that $r(\tau) - 1$ behaves like a
nonnegative power of $|\tau|$ for sufficiently small $|\tau|$. We normalize and
define this class as follows: The continuous covariance $r(\tau)$ is said to be
of class $\alpha$ if, as $\tau$ approaches zero,

$$r(\tau) = 1 - \frac{|\tau|^{\alpha}}{\Gamma(\alpha + 1)} + o(|\tau|^{\alpha}),$$

and if $r(\tau)$ is strictly monotone in some right neighborhood $0 < \tau < \tau_0$
of the origin. Here necessarily $0 \le \alpha \le 2$ and $\Gamma(\alpha + 1)$ is the usual
gamma function. The normalization is contained in the specific choice
of the coefficient of $|\tau|^{\alpha}$.

To the author’s knowledge, when the scaling laws (1) and (2) are
taken into account, there are only three distinct covariances for which
$P[T, r(\tau)]$ is known explicitly. These are:


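(i) $r_1(\tau) = e^{-|\tau|}$, $0 \le \tau < \infty$,

$$P[T, r_1(\tau)] = \frac{1}{\pi}\arcsin e^{-T}, \qquad 0 \le T < \infty;$$

(ii) $r_2(\beta, \tau) = 1 - \beta^2 + \beta^2\cos(\tau/\beta)$, $0 \le \tau < \infty$, $0 \le \beta \le 1$,

$$P[T, r_2(\beta, \tau)] = \text{(expression not legible in the source)};$$

(iii) $r_3(\tau) = 1 - |\tau|$ for $|\tau| \le 1$, $r_3(\tau) = 0$ for $|\tau| \ge 1$,

$$P[T, r_3(\tau)] = \frac{1}{4} + \frac{1}{2\pi}\left[\arcsin(1 - T) - \sqrt{T(2 - T)}\right], \qquad 0 \le T \le 1.$$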


The process with covariance $r_1(\tau)$ is Markovian, and it is this special
property that permits determination of $P[T, r_1(\tau)]$ in this case (see Ref.
22 or Ref. 21, Section IX).

Case (ii) corresponds to the stochastic process

$$X(t) = A + B\cos[t/\beta + \Phi],$$

with $A$, $B$ and $\Phi$ independent random variables, the two former being
normal with mean zero and variances $1 - \beta^2$ and $\beta^2$ respectively, and the
latter being distributed uniformly in $(0, 2\pi)$. The determination of $P$
in this case is an exercise in integration and elementary probability the-
ory that will be omitted here. For the obvious generalization of this case,
namely,

$$X(t) = A + \sum_{j=1}^{N} B_j\cos[t/\beta_j + \Phi_j],$$

$P[T, r(\tau)]$ can be expressed in principle as a $(2N + 1)$-fold integral. Ex-
cept in the case $N = 1$ presented, the integrals appear intractable.

The form for $P[T, r_3(\tau)]$ given follows from results found in Ref. 23.
Note that it is valid only for $T \le 1$. We have been unable to extend $P$
beyond this point.

These examples shed little light on the many questions that naturally
arise concerning the behavior of $P[T, r(\tau)]$, both as a function of $T$ and
as a functional of $r(\tau)$. What are possible asymptotic behaviors of $P$
for large $T$? What features of $r(\tau)$ determine this behavior? To what
extent is $P$ determined by the behavior of $r(\tau)$ in the neighborhood of
$\tau = 0$? (For example, if $r(\tau)$ is analytic in the neighborhood of $\tau = 0$,
then it can be extended as a covariance in only one way, namely, by its
analytic continuation. In this case, then, $P[T, r(\tau)]$ is completely deter-
mined by the behavior of $r(\tau)$ near $\tau = 0$.) If $\rho(\tau)$ is another covariance,
in some sense close to $r(\tau)$ for $0 \le \tau \le T$, is $P[T, r(\tau)]$ close in some
sense to $P[T, \rho(\tau)]$? How can $P[T, r(\tau)]$ be determined numerically for a
given covariance $r(\tau)$?

These and many other basic questions remain to be answered in full.


1.2 $P[T, r(\tau)]$ as a Limit


Let $0 = t_1 < t_2 < \cdots < t_n = T$ be a partition of the interval $(0, T)$
into $n - 1$ parts. The $n$ random variables $X(t_1), X(t_2), \cdots, X(t_n)$ are
jointly Gaussian with covariance matrix $r = (r_{ij})$, where $r_{ij} = r(t_i - t_j)$.
Denote by $P_n(r)$ the probability that these $n$ random variables be non-
negative. Because of the assumed separability of the process,

$$P[T, r(\tau)] = \lim P_n(r), \tag{4}$$

where it is understood that the limit is taken as the partition is refined
with mesh tending to zero. If $r(\tau)$ is positive definite, then $|r| \ne 0$ for
any choice of partition, and one can write explicitly

$$P_n(r) = (2\pi)^{-n/2}|r|^{-1/2}\int_0^{\infty} dx_1 \cdots \int_0^{\infty} dx_n \exp\Big[-\tfrac{1}{2}\sum_{i,j}(r^{-1})_{ij}x_i x_j\Big]. \tag{5}$$

It is somewhat surprising that information about $P[T, r(\tau)]$ is so difficult
to obtain when it can be expressed as the limit of the apparently not too
unwieldy expression on the right of (5). This integral is deceptive. For
$n > 3$ it cannot be expressed in terms of elementary functions of the co-
variance elements $r_{ij}$. Series expansions and upper and lower bounds can
be easily written for this integral, but most of the obvious ones yield
vacuous results in the limit as the partition is refined.

The integral (5) admits of a simple geometric interpretation obtained
by reducing the quadratic form in the exponent to a sum of squares by
a linear transformation and performing a radial integration. $P_n(r)$ is the
fraction of the unit sphere in Euclidean $n$-space cut out by $n$ hyperplanes
through the center of the sphere. The angle $\theta_{ij}$ between the normals to
the $i$th and $j$th hyperplanes directed into the cutout region is given by
$\cos\theta_{ij} = r_{ij}$, $i,j = 1,2,\ldots,n$. This geometric interpretation of $P_n(r)$
holds even when $|r| = 0$. For $n = 2$ and 3, this picture gives at once

$$P_2 = \frac{1}{2\pi}\,[\pi - \theta_{12}] = \frac14 + \frac{1}{2\pi}\arcsin r_{12}, \tag{6}$$

$$P_3 = \frac{1}{4\pi}\,[2\pi - \theta_{12} - \theta_{13} - \theta_{23}] = \frac18 + \frac{1}{4\pi}\,[\arcsin r_{12} + \arcsin r_{13} + \arcsin r_{23}]. \tag{7}$$
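As a small numerical companion to (6) and (7), the following Python sketch evaluates the two closed forms and uses (7) as a three-point upper bound on $P[T,r(\tau)]$. The function names and the Gaussian-shaped example covariance are illustrative assumptions and do not come from the original text.

```python
import math

def orthant_p2(r12):
    """Eq. (6): Pr{X1 >= 0, X2 >= 0} for zero-mean, unit-variance Gaussians."""
    return 0.25 + math.asin(r12) / (2.0 * math.pi)

def orthant_p3(r12, r13, r23):
    """Eq. (7): Pr{X1 >= 0, X2 >= 0, X3 >= 0}."""
    s = math.asin(r12) + math.asin(r13) + math.asin(r23)
    return 0.125 + s / (4.0 * math.pi)

def three_point_bound(r, t1, t2, t3):
    """Upper bound on P[T, r] from samples at t1 <= t2 <= t3 in (0, T)."""
    return orthant_p3(r(t2 - t1), r(t3 - t1), r(t3 - t2))

def example_cov(tau):
    # Hypothetical class 2 covariance, used only for illustration.
    return math.exp(-tau * tau / 2.0)

if __name__ == "__main__":
    for T in (0.5, 2.0, 8.0):
        print(T, three_point_bound(example_cov, 0.0, T / 2.0, T))
```

For a covariance that is never negative the bound cannot fall below 1/8 however large $T$ is taken, which is the weakness of this bound for large $T$.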


Seen on the surface of the sphere, the region described above is the
generalization of the spherical triangle in three-space and is known as an
$n$-dimensional spherical simplex. Geometers have studied the problem
of expressing the content of the spherical simplex in terms of the angles
between its bounding surfaces. Many of their results can be readily
derived from known results in probability theory using the connection
with $P_n(r)$ just mentioned (see Section 2.1).

It is clear that $P_n(r)$ is an upper bound for $P[T,r(\tau)]$. The result (7)
then is a simple upper bound for $P[T,r(\tau)]$, where $r_{12} = r(t_2 - t_1)$,
$r_{13} = r(t_3 - t_1)$, $r_{23} = r(t_3 - t_2)$ and $t_1, t_2, t_3$ are any three points in the
interval $(0,T)$. For very small values of $T$, this upper bound can be made
close to the true value of $P[T,r(\tau)]$. For large values of $T$, this is
generally not the case. If, for example, $r(\tau)$ is never negative, $P_3$ is always
greater than $\frac18$. If $r(\tau)$ oscillates in sign, there is a minimum value for $P_3$
different from zero (unless $r(\tau)$ achieves the value $-1$) obtainable for
any choice of $t_1 \le t_2 \le t_3$, and hence this bound for $P[T,r(\tau)]$ does not
approach zero for large $T$.




1.3 A Comparison Theorem for $P[T,r(\tau)]$


Recall that in the geometric picture of $P_n(r)$, $r_{ij} = \cos\theta_{ij}$ where $\theta_{ij}$ is
the angle between the inward normals to hyperplanes $i$ and $j$. Intuitively,
it is clear that if this angle is decreased, i.e., if $r_{ij}$ is increased, $P_n(r)$
should also increase. This is borne out by the following

Lemma 1† — Let $P_n(r)$ be the probability that $n$ jointly Gaussian variates
with mean zero and normalized covariance matrix $r$ $(r_{ii} = 1)$ be
nonnegative. Let $q$ be another normalized $n \times n$ covariance matrix. If $r_{ij} \ge q_{ij}$
for $i,j = 1,2,\ldots,n$, then $P_n(r) \ge P_n(q)$.

Note that the matrices r and q need only be nonnegative definite (as 
distinguished from positive definite). 

By regarding $P[T,r(\tau)]$ as a limit of $P_n(r)$, as explained in the
preceding section, Lemma 1 can be used to deduce the following comparison
theorem.

Theorem 1 — If $r(\tau) \ge q(\tau)$ for $0 \le \tau \le T_0$, then $P[T,r(\tau)] \ge P[T,q(\tau)]$
for $0 \le T \le T_0$.

The covariance function of a process is generally regarded as a rough
measure of how much the process "hangs together." This view is
supported by the theorem. A process with a larger covariance function
hangs together more and is more likely to maintain the same sign than
one with a smaller covariance.

The comparison theorem can be used with the three covariances
(Section 1.1) for which $P[T,r(\tau)]$ is known exactly to bound this
quantity for other covariances. The theorem is particularly useful for
comparing covariances of the same class. Let $r(\tau)$ and $q(\tau)$ both be of class
$\alpha$, and suppose that $r(\tau) \ge q(\tau)$ in some neighborhood of the origin.
Then $P[T,r(\tau)] \ge P[T,q(\tau)]$ in this neighborhood. But, for any $\lambda > 1$,
$q(\tau) \ge r(\lambda\tau)$ in some sufficiently small neighborhood of the origin,
so that also $P[T,q(\tau)] \ge P[T,r(\lambda\tau)] = P[\lambda T,r(\tau)]$ by the scaling law
(2). Choosing $\lambda$ appropriately leads to the following

Theorem 2 — Let $r(\tau)$ and $q(\tau)$ be of class $\alpha$ with $r(\tau) \ge q(\tau)$ in some
neighborhood of $\tau = 0$. Then for some $T^* > 0$,

$$P[T,r(\tau)] \ge P[T,q(\tau)] \ge P[r^{-1}(q(T)),\,r(\tau)], \qquad 0 \le T \le T^*.$$

The theorem is proved in Section 2.3 where the determination of $T^*$
and the choice of proper branch for $r^{-1}(q)$ are also discussed. Knowledge
of $P[T,r(\tau)]$ thus provides both upper and lower bounds for $P[T,q(\tau)]$
near $T = 0$.

† Proved in Section 2.2. A special case of this lemma was proved by J. Chover
by a completely different method. He applied his result to obtain a weak version
of our Theorem 1. Chover's result inspired much of the present paper.
1.4 Some Related Results Useful for Large $T$


From Lemma 1, it is easy to deduce (see Section 2.4)

Theorem 3 — Let $T_1 \ge 0$, $T_2 \ge 0$, $T_3 \ge 0$ be such that $T_1 + T_2 = T_3$.
If $r(\tau) \ge 0$ for $0 \le \tau \le T_3$, then

$$P[T_3,r(\tau)] \ge P[T_1,r(\tau)]\,P[T_2,r(\tau)]. \tag{8}$$


This theorem provides some asymptotic information on $P[T,r(\tau)]$ for
covariances that are never negative. It implies for these covariances
that $-(1/T)\log P[T,r(\tau)]$ approaches a nonnegative limit as $T$ becomes
infinite. In this sense, then, for nonnegative covariances, $P[T,r(\tau)]$ cannot
decrease asymptotically more rapidly than exponentially. An exponential
lower bound for these covariances is found by iterating (8). Thus, if
$T = NT_0$, $P[T,r(\tau)] = P[NT_0,r(\tau)] \ge P[T_0,r(\tau)]^N$. One obtains in
this manner the exponential bound

$$P[T,r(\tau)] \ge P_0\,P_0^{T/T_0}, \qquad T \ge T_0, \tag{9}$$

which holds for nonnegative $r(\tau)$ with $P_0 = P[T_0,r(\tau)]$, $T_0 > 0$.

For covariances for which $P[T,r(\tau)]$ is not known, (9) still gives useful
information by replacing $P_0$ by a lower bound. For example, from the
lower bounds presented below Theorem 6 in Section 1.6, it follows that
for nonnegative $r(\tau)$ of class 2, $P[T,r(\tau)] \ge f(T)$ where

$$f(T) = \begin{cases} \dfrac12\Bigl[1 - \dfrac{T}{\pi}\Bigr], & 0 \le T \le \dfrac{\pi}{2}, \\[2ex] \dfrac14\Bigl[\dfrac32 - \dfrac{T}{\pi}\Bigr], & \dfrac{\pi}{2} \le T \le \dfrac{3\pi}{2}. \end{cases} \tag{10}$$

By choosing $T_0$ to maximize $f(T_0)^{1/T_0}$ and using this maximum value for
$P_0$ in (9), one obtains the following

Lower Bound — If $r(\tau)$ is of class 2 and nonnegative, then

$$P[T,r(\tau)] \ge 0.121\, e^{-0.662\,T}, \qquad T \ge (1.016)\pi.$$
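The constants in this lower bound can be checked numerically. The sketch below assumes that $f(T)$ in (10) is $\frac12(1 - T/\pi)$ up to $T = \pi/2$ and $\frac14(\frac32 - T/\pi)$ from there to $3\pi/2$, and searches a grid for the block length $T_0$ that maximizes $f(T_0)^{1/T_0}$. The grid search and all names are illustrative choices, not the author's procedure.

```python
import math

def f_lower(T):
    """Piecewise lower bound (10), as assumed in the lead-in."""
    if T <= math.pi / 2.0:
        return 0.5 * (1.0 - T / math.pi)
    return 0.25 * (1.5 - T / math.pi)

def best_block_length(steps=200000):
    """Grid search on (0, 3*pi/2) for T0 maximizing log f(T0) / T0.

    Returns (T0, P0, rate), so that (9) reads P[T] >= P0 * exp(-rate * T).
    """
    best_T, best_val = None, -math.inf
    for k in range(1, steps):
        T = 1.5 * math.pi * k / steps
        val = math.log(f_lower(T)) / T
        if val > best_val:
            best_T, best_val = T, val
    return best_T, f_lower(best_T), -best_val

if __name__ == "__main__":
    T0, P0, rate = best_block_length()
    print(T0 / math.pi, P0, rate)
```

The objective is very flat near its maximum, so the maximizing $T_0$ is determined less sharply than the resulting decay rate.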


For a specific nonnegative covariance of class 2, a somewhat smaller
exponent can often be obtained by using for $f$ the lower bound of
Theorem 6, or a lower bound obtained from the comparison theorem and
example (ii) of Section 1.1.

For covariances (such as $r_3(\tau)$ of Section 1.1) that are identically zero
for $\tau \ge T_1$ for some $T_1 > 0$, an exponential upper bound can readily be
written for $P[T,r(\tau)]$. For example, if $T = (2N - 1)T_1$, then
$P[(2N - 1)T_1,r(\tau)]$ is certainly not greater than the probability that
the process be nonnegative in the intervals $(0,T_1), (2T_1,3T_1), \ldots,$
$((2N - 2)T_1,(2N - 1)T_1)$. But the process in any one of these
intervals is independent of the process in the other intervals because of the
vanishing of $r(\tau)$ for $\tau \ge T_1$. Thus, $P[T,r(\tau)] \le \{P[T_1,r(\tau)]\}^N$.
Arguing in this manner, one arrives at the

Upper Bound — If $r(\tau)$ vanishes for $\tau \ge T_1$, then

$$P[T,r(\tau)] \le \frac{1}{\sqrt{P_1}}\,P_1^{\,T/2T_1},$$

where $P_1 = P[T_1,r(\tau)]$.


1.5 Bounds from Rice’s Series 


Let $0 = t_1 < t_2 < \cdots < t_n = T$ be a partition of the interval $(0,T)$
into $n - 1$ parts. Let $A_i$ denote the event: "$X(t)$ changes sign at least
once in the interval $t_i \le t < t_{i+1}$," $i = 1,2,\ldots,n - 1$. Then, by the
method of inclusion and exclusion,

$$2P[T,r(\tau)] = 1 - \sum_i \Pr\{A_i\} + \sum_{i<j} \Pr\{A_i \cap A_j\} - \sum_{i<j<k} \Pr\{A_i \cap A_j \cap A_k\} + \cdots + (-1)^{n-1} \Pr\{A_1 \cap A_2 \cap \cdots \cap A_{n-1}\}$$

is the probability that none of the events $A_i$ occur. If $r''(0)$ exists, the
above series approaches as a limit as the partition is refined with mesh
tending to zero

$$2P[T,r(\tau)] = 1 - \int_0^T q_1(t_1)\,dt_1 + \frac{1}{2!}\int_0^T dt_1 \int_0^T dt_2\, q_2(t_1,t_2) - \cdots$$

(compare Rice, Equation 3.4-11) which we write as

$$2P[T,r(\tau)] = 1 + \sum_{n=1}^{\infty} \frac{(-1)^n B_n}{n!}, \tag{11}$$

$$B_n = \int_0^T dt_1 \cdots \int_0^T dt_n\, q_n(t_1,\ldots,t_n). \tag{12}$$

Here $q_n(t_1,\ldots,t_n)\,dt_1 \cdots dt_n$ is the probability that $X(t)$ has one or more
zeros in each of the intervals $(t_1,t_1 + dt_1),\ldots,(t_n,t_n + dt_n)$. The existence
of $r''(0)$ assures us that $X(t)$ has a derivative almost everywhere in
$(0,T)$ for almost all sample functions. One then has

$$q_n(t_1,\ldots,t_n) = \int_{-\infty}^{\infty} d\xi_1 \cdots \int_{-\infty}^{\infty} d\xi_n\, |\xi_1 \xi_2 \cdots \xi_n|\, \bigl[p(\xi_1,\ldots,\xi_n,x_1,\ldots,x_n)\bigr]_{x_1 = \cdots = x_n = 0}.$$

Here $p(\xi_1,\ldots,\xi_n,x_1,\ldots,x_n)$ is the joint density for the random
variables $X'(t_1),\ldots,X'(t_n),X(t_1),\ldots,X(t_n)$ with $\xi_i$ associated with $X'(t_i)$
and $x_i$ associated with $X(t_i)$, $i = 1,2,\ldots,n$. $X'(t)$ is the derivative of
$X(t)$ with respect to $t$.

From the derivation of the method of inclusion and exclusion,
successive partial sums of (11) alternately overestimate and
underestimate $2P[T,r(\tau)]$. We therefore have the sequence of bounds


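$$\begin{aligned} 0 &\le 2P[T,r(\tau)] \le 1, \\ 1 - \frac{B_1}{1!} &\le 2P[T,r(\tau)] \le 1 - \frac{B_1}{1!} + \frac{B_2}{2!}, \\ 1 - \frac{B_1}{1!} + \frac{B_2}{2!} - \frac{B_3}{3!} &\le 2P[T,r(\tau)] \le 1 - \frac{B_1}{1!} + \frac{B_2}{2!} - \frac{B_3}{3!} + \frac{B_4}{4!}, \end{aligned} \tag{13}$$

etc. Unfortunately, except for $n = 1,2,3$, the integrand $q_n(t_1,\ldots,t_n)$
occurring in the definition of $B_n$ cannot be expressed in terms of
elementary functions. For covariances $r(\tau)$ of class 2, one has

$$q_1(t) = \frac{1}{\pi},$$

$$q_2(t_1,t_2) = \frac{\mu^{1/2}\,\bigl[\sqrt{1 - a^2} + a \arcsin a\bigr]}{\pi^2\,(1 - r^2)\,\sqrt{1 - a^2}},$$

where

$$\mu = (1 - r^2)(1 - r''^2) - r'^2\,(2 + 2rr'' - r'^2),$$

$$a = \frac{(1 - r^2)\,r'' + r\,r'^2}{1 - r^2 - r'^2},$$

and

$$r = r(t_2 - t_1), \qquad r' = r'(t_2 - t_1), \qquad r'' = r''(t_2 - t_1).$$

The expression for $q_3$ is too complicated to warrant display here.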
Bounds given by partial sums such as (13) cannot be expected to
yield useful results for large $T$. Typically, for large $T$, $B_n$ behaves like
$T^n$: the upper bounds exceed unity for large $T$ and the lower bounds
become negative.

For small $T$, however, (13) yields useful information. One has $B_1 = T/\pi$.
If $r(\tau) = 1 - \tau^2/2 + c\tau^4/4! + O(\tau^6)$, a very tedious computation shows
that for small $T$,

$$B_2 = \frac{c - 1}{24\pi}\,T^3 + O(T^4), \qquad B_3 = O(T^4).$$

From this and the inequalities (13) follows

Theorem 4 — If for small $\tau$

$$r(\tau) = 1 - \frac{\tau^2}{2} + c\,\frac{\tau^4}{4!} + O(\tau^6),$$

then the first three right-hand derivatives of $P[T,r(\tau)]$ with respect to $T$
exist at $T = 0$ and are given by

$$P[0,r(\tau)] = \frac12, \qquad P'[0+,r(\tau)] = -\frac{1}{2\pi}, \qquad P''[0+,r(\tau)] = 0, \qquad P'''[0+,r(\tau)] = \frac{c - 1}{16\pi}.$$

The assumed form for $r(\tau)$ in Theorem 4 is important. It has been
shown by Longuet-Higgins that if $r(\tau) = 1 - \tau^2/2 + b|\tau|^3 + O(\tau^4)$,
$b \ne 0$, then for small $T$, $B_n = O(T^2)$ for $n = 2,3,4,\ldots$. One can only
conclude in this case that $P'[0+,r(\tau)] = -1/2\pi$.

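The power series $1 + \sum_{n=1}^{\infty} B_n\lambda^n/n!$ can be written formally as

$$\exp \sum_{n=1}^{\infty} c_n\lambda^n/n!.$$

Expand the latter in a power series, equate coefficients of like powers
of $\lambda$ and set $\lambda = -1$. There results the formal identity using (11)

$$2P[T,r(\tau)] = \exp\Bigl(-c_1 + \frac{c_2}{2!} - \frac{c_3}{3!} + \cdots\Bigr), \tag{14}$$

where

$$\begin{aligned} c_1 &= B_1 = \frac{T}{\pi}, \\ c_2 &= B_2 - B_1^2, \\ c_3 &= B_3 - 3B_1B_2 + 2B_1^3, \\ c_4 &= B_4 - 4B_1B_3 - 3B_2^2 + 12B_1^2B_2 - 6B_1^4, \end{aligned} \tag{15}$$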

etc., with the $B$'s given by (11) and (12). Relations (15) are the usual
ones connecting semi-invariants with moments about the origin (see Ref. 39, p.
37 or Ref. 40, p. 186). Kuznetsov, Stratonovich and Tikhonov have
suggested the use of (14) keeping a finite number of $c$'s as a better
approximation to $P$ than series (11). For large $T$, (14) will perhaps yield
a better approximation than (11), but it is difficult to see under just
what circumstances this will be true. A knowledge of the asymptotic
behavior of the $c$'s for large $T$ is needed, but this appears to be a difficult
point.
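Relations (15) are easy to mechanize. The sketch below, in Python, computes $c_1,\ldots,c_4$ from $B_1,\ldots,B_4$ and evaluates the truncated exponential form (14). The function names are illustrative; the check that $c_n = 0$ for $n > 1$ when $B_n = (B_1)^n$ corresponds to the case of independent axis crossings.

```python
import math

def semi_invariants(B1, B2, B3, B4):
    """Relations (15): c_1..c_4 in terms of B_1..B_4."""
    c1 = B1
    c2 = B2 - B1 ** 2
    c3 = B3 - 3.0 * B1 * B2 + 2.0 * B1 ** 3
    c4 = B4 - 4.0 * B1 * B3 - 3.0 * B2 ** 2 + 12.0 * B1 ** 2 * B2 - 6.0 * B1 ** 4
    return c1, c2, c3, c4

def truncated_two_p(cs):
    """Form (14) keeping len(cs) terms: exp(-c1 + c2/2! - c3/3! + ...)."""
    s = sum((-1) ** n * c / math.factorial(n) for n, c in enumerate(cs, start=1))
    return math.exp(s)

if __name__ == "__main__":
    b = 0.3  # hypothetical value of B_1 = T/pi
    print(semi_invariants(b, b ** 2, b ** 3, b ** 4))
    print(truncated_two_p([b]), math.exp(-b))
```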

A truncated form of (14) will not in general yield the correct
asymptotic behavior of $P[T,r(\tau)]$. For example, retaining only $c_1$, (14) gives
$2P[T,r(\tau)] \sim e^{-T/\pi}$ for all class 2 covariances. That this is not in general
correct can be seen from a family of simple counterexamples. If $q(\tau)$ is
of class 2, then so is

$$r^*(\tau) = q(\alpha\tau)\,\frac{\sin \beta\tau}{\beta\tau}, \tag{16}$$

where $\alpha = \sqrt{1 - \beta^2/3}$ and $0 < \beta < \sqrt3$. If $X(t)$ has covariance $r^*(\tau)$,
then since $r^*(n\pi/\beta) = 0$, $n = \pm1,\pm2,\ldots$, the random variables

$$X(\pi/\beta),\ X(2\pi/\beta),\ X(3\pi/\beta),\ \ldots$$

are independent. Set $N = [\beta T/\pi]$. Then

$$P[T,r^*(\tau)] \le \Pr\{X(j\pi/\beta) \ge 0,\ j = 1,\ldots,N\} = \bigl(\tfrac12\bigr)^N \le 2\bigl(\tfrac12\bigr)^{\beta T/\pi} = 2e^{-(\beta \log 2)T/\pi}.$$

Thus if

$$\sqrt3 = 1.732 > \beta > \frac{1}{\log 2} = 1.442, \tag{17}$$

$e^{T/\pi}P[T,r^*(\tau)]$ approaches zero exponentially for large $T$, and the first
term in the exponent of (14) yields an incorrect asymptotic behavior.
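The counterexample can be explored numerically. The Python sketch below assumes that the covariance (16) is $q(\alpha\tau)\sin\beta\tau/\beta\tau$ with $\alpha = \sqrt{1 - \beta^2/3}$, and takes a Gaussian-shaped $q$ purely as an illustration. It checks that $r^*$ vanishes at multiples of $\pi/\beta$ and that the sampled bound $(\frac12)^N$ beats $e^{-T/\pi}$ once $\beta$ lies in the range (17).

```python
import math

def r_star(tau, beta, q):
    """Covariance (16), as assumed: q(alpha*tau) * sin(beta*tau)/(beta*tau)."""
    alpha = math.sqrt(1.0 - beta * beta / 3.0)
    if tau == 0.0:
        return q(0.0)
    return q(alpha * tau) * math.sin(beta * tau) / (beta * tau)

def sampled_bound(T, beta):
    """(1/2)**N with N = floor(beta*T/pi): bound from the independent samples."""
    return 0.5 ** math.floor(beta * T / math.pi)

def q_example(tau):
    # Hypothetical class 2 covariance chosen for illustration.
    return math.exp(-tau * tau / 2.0)

if __name__ == "__main__":
    beta = 1.6  # inside the range (17)
    for T in (50.0, 100.0, 200.0):
        print(T, math.exp(T / math.pi) * sampled_bound(T, beta))
```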
It is interesting to note that the form $e^{-T/\pi}$ obtained from (14) by
retaining only $c_1$ would be correct for a process in which the axis
crossings were independent. One would then have $q_n(t_1,\ldots,t_n) = \prod q_1(t_i)$,
$B_n = (B_1)^n$ and $c_n = 0$, $n > 1$. For processes with the covariance (16)
with $\beta$ given by (17), $P[T,r^*(\tau)]$ decays even more rapidly. This has
nothing to do with the asymptotic behavior of $r^*$: by proper choice of
$q(\tau)$, this can be altered at will. One must suppose this rapid decay of
$P[T,r^*(\tau)]$ is due to the fact that typically $r^*(\tau)$ takes negative values
so that at certain time separations the process is anticorrelated. Indeed,
it is tempting to conjecture that for nonnegative class 2 covariances,
$e^{T/\pi}P[T,r(\tau)]$ increases without limit for large $T$.
1.6 Some Other Bounds for $P[T,r(\tau)]$


In this section we list a few miscellaneous bounds on $P[T,r(\tau)]$.

Theorem 5 —

$$P[T,r(\tau)] \le \frac{2}{\pi} \int_0^1 (1 - u) \arcsin r(Tu)\,du.$$

The theorem is proved in Section 2.5. If $\tau \arcsin r(\tau)$ is integrable, the
bound in Theorem 5 approaches zero like $1/T$.
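The bound of Theorem 5 is a one-dimensional integral and is simple to evaluate. The following Python sketch uses composite Simpson's rule, assuming the bound reads $(2/\pi)\int_0^1 (1-u)\arcsin r(Tu)\,du$. The quadrature rule, the step count and the example covariance are illustrative choices.

```python
import math

def theorem5_bound(r, T, n=2000):
    """Simpson estimate of (2/pi) * int_0^1 (1 - u) * asin(r(T*u)) du."""
    if n % 2:
        n += 1  # Simpson's rule needs an even number of panels
    h = 1.0 / n

    def g(u):
        # Clamp guards against rounding slightly outside [-1, 1].
        return (1.0 - u) * math.asin(max(-1.0, min(1.0, r(T * u))))

    s = g(0.0) + g(1.0)
    for k in range(1, n):
        s += (4.0 if k % 2 else 2.0) * g(k * h)
    return (2.0 / math.pi) * s * h / 3.0

def example_cov(tau):
    # Hypothetical class 2 covariance used only for illustration.
    return math.exp(-tau * tau / 2.0)

if __name__ == "__main__":
    for T in (0.0, 1.0, 5.0, 20.0):
        print(T, theorem5_bound(example_cov, T))
```

At $T = 0$ the integrand reduces to $(1 - u)\pi/2$, so the bound starts at $\frac12$, the exact value of $P[0,r(\tau)]$.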

Lower bounds for $P[T,r(\tau)]$ are difficult to obtain. One is given by
(see Section 2.6)

Theorem 6 — If $r(\tau)$ is of class 2,

$$P[T,r(\tau)] \ge \frac38 - \frac{T}{4\pi} + \frac{1}{4\pi} \arcsin r(T).$$

This bound goes negative for relatively small values of $T$ (at least
before $T = 2\pi$). It gives somewhat more information than the bound

$$P[T,r(\tau)] \ge \frac12\Bigl[1 - \frac{T}{\pi}\Bigr], \tag{18}$$

obtained from Rice's series (Section 1.5) by retaining only $B_1$. The
bound obtained by retaining $B_1$, $B_2$ and $B_3$ is of course generally much
better than that of Theorem 6 but is so complicated that it can be used
only with difficulty even with a modern computer. For nonnegative
covariances of class 2, Theorem 6 gives $P[T,r(\tau)] \ge \frac38 - T/4\pi$. This,
together with (18), gives (10).


Theorem 7 — If in the neighborhood of $\tau = 0$,

$$r(\tau) = 1 - \frac{\tau^2}{2} + \frac{1}{\beta^2}\,\frac{\tau^4}{4!} + O(\tau^6),$$

then

$$P[T,r(\tau)] \le \frac12 - \frac{T}{4\pi} - \frac{1}{2\pi} \arcsin\Bigl[\beta \sin\Bigl(\frac{T}{2\beta}\Bigr)\Bigr], \qquad 0 \le T \le T_1,$$

where $T_1 = \min(\beta\pi,\tau_0)$ and $\tau_0$ is the smallest positive value of $\tau$ for which
$r(\tau) = 1 - 2\beta^2$. This theorem follows from the comparison Theorem
1, the result (ii) of Section 1.1 and the fact (see Theorem 14, p. 494),
that for $0 \le \tau \le T_1$, the covariance of Theorem 7 is dominated by
$r_2(\beta,\tau)$.

Theorem 8 — If $r(\tau)$ is nonnegative and of class 2, then

$$P[T,r(\tau)] \ge \frac12 - \frac{T}{4\pi} - \frac{1}{2\pi} \arcsin\Bigl[\frac{1}{\sqrt2} \sin \frac{T}{\sqrt2}\Bigr], \qquad 0 \le T \le \frac{\pi}{\sqrt2}.$$

This theorem follows from the comparison Theorem 1, the result (ii)
of Section 1.1 and the fact (see Theorem 13 in Section 2.7) that for
$0 \le \tau \le \pi/\sqrt2$, every nonnegative covariance of class 2 is greater than
$r_2(1/\sqrt2,\tau)$.

We conclude this section with a rather weak, but sometimes useful, 
result proved in Section 2.8. 

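Theorem 9 — Let $h(\xi)$ be nonnegative for $0 \le \xi \le \theta$ and let $h(\xi) = 0$
for $\xi < 0$ and $\xi > \theta$. Define

$$\rho_\theta(x) = \int_{-\infty}^{\infty} h(x + \xi)\,h(\xi)\,d\xi$$

and set

$$r_\theta(\tau) = \int_{-\infty}^{\infty} r(\tau - x)\,\rho_\theta(x)\,dx.$$

Then

$$P[T,r_\theta(\tau)] \ge P[T + \theta,r(\tau)].$$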


1.7 Relationship Between $P[T,r(\tau)]$ and $F[\lambda,r(\tau)]$


If $r''(0)$ exists, then almost all sample functions $X(t)$ possess a
derivative almost everywhere. If $r''(0)$ does not exist, then almost all sample
functions are nowhere differentiable. In this latter case, if a realization
$X(t)$ has a zero at $t = 0$, it almost certainly has infinitely many zeros
in every right neighborhood of $t = 0$. In discussing $F[\lambda,r(\tau)]$, the
distribution of the interval, $l$, between successive zeros of $X(t)$, we
accordingly restrict our attention to covariances for which $r''(0)$ exists.

The quantity $P[T,r(\tau)] - P[T + \Delta,r(\tau)]$ is the measure of those
sample functions which are nonnegative in $(0,T)$ but are not
nonnegative in $(-\Delta,0)$, i.e., the measure of those sample functions that are
nonnegative in $(0,T)$ and have at least one axis crossing in $(-\Delta,0)$.
Divide this quantity by the probability $\nu\Delta + o(\Delta)$ that $X(t)$ have one
or more upward axis crossings in $(-\Delta,0)$ and allow $\Delta$ to approach zero.
There results

$$Q[T,r(\tau)] = -\frac{1}{\nu}\,\frac{d}{dT}\,P[T,r(\tau)] = 1 - F[T,r(\tau)]. \tag{19}$$

Here $Q[T,r(\tau)]$ is the conditional probability that $X(t)$ be nonnegative
in $(0,T)$ given an upcrossing of the axis at $t = 0$; $F[\lambda,r(\tau)] = \Pr\{l \le \lambda\}$
is the distribution function for the interval $l$ between zeros. One should
note carefully that the condition in the definition of $Q$ is in the
"horizontal window sense" (see Ref. 10, Section 2 for a more complete
discussion of this term). We shall find $Q[T,r(\tau)]$ more convenient to deal
with than $F[T,r(\tau)]$.

From its definition, $Q[T,r(\tau)]$ is nonincreasing as a function of $T$. It
assumes the value 1 for $T = 0$. Like $P[T,r(\tau)]$, it satisfies the scaling
laws

$$\begin{aligned} Q[T,\lambda r(\tau)] &= Q[T,r(\tau)], \\ Q[T,r(\lambda\tau)] &= Q[\lambda T,r(\tau)], \end{aligned} \qquad \lambda > 0. \tag{20}$$

For most purposes, then, it suffices to consider only class 2 covariances.
In this case (see Ref. 19, Equation (3.8-10)) $\nu = 1/2\pi$ and (19) becomes

$$Q[T,r(\tau)] = -2\pi\,\frac{d}{dT}\,P[T,r(\tau)]. \tag{21}$$
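Clearly upper and lower bounds on $Q[T,r(\tau)]$, say

$$\begin{aligned} Q_U[T,r(\tau)] &\ge Q[T,r(\tau)], \qquad 0 \le T \le T_0, \\ Q_L[T,r(\tau)] &\le Q[T,r(\tau)], \qquad 0 \le T \le T_0, \end{aligned}$$

furnish bounds on $P[T,r(\tau)]$ by integration:

$$\frac12 - \frac{1}{2\pi} \int_0^T Q_U[x,r(\tau)]\,dx \le P[T,r(\tau)] \le \frac12 - \frac{1}{2\pi} \int_0^T Q_L[x,r(\tau)]\,dx, \qquad 0 \le T \le T_0.$$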


However, since $Q$ is nonincreasing, it is also possible to obtain weak
bounds on $Q$ from known bounds on $P$. For example, since $Q$ is
nonincreasing, if $b > a \ge 0$,

$$(b - a)\,Q[a,r(\tau)] \ge \int_a^b Q[y,r(\tau)]\,dy \ge (b - a)\,Q[b,r(\tau)],$$

or from (21)

$$Q[a,r(\tau)] \ge 2\pi\,\frac{P[a,r(\tau)] - P[b,r(\tau)]}{b - a} \ge Q[b,r(\tau)]. \tag{22}$$
Thus if $P_U(T)$ and $P_L(T)$ are respectively upper and lower bounds for
$P[T,r(\tau)]$ valid for all $T$,

$$\max_{x > T}\, 2\pi\,\frac{P_L(T) - P_U(x)}{x - T} \le Q[T,r(\tau)] \le \min_{0 \le x < T}\, 2\pi\,\frac{P_U(x) - P_L(T)}{T - x}. \tag{23}$$

Note that the left inequality of (22) for $a = 0$, $b = T$ again gives (18).
Also from (21) and the fact that $Q$ is nonincreasing, it follows that
$P[T,r(\tau)]$ for class 2 covariances must be convex downward.

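To the author's knowledge, when the scaling laws (20) are taken into
account, the only covariance for which $Q[T,r(\tau)]$ is known explicitly
is $r_2(\beta,\tau)$ of (ii), Section 1.1. One has

$$r_2(\beta,\tau) = 1 - \beta^2 + \beta^2 \cos\Bigl(\frac{\tau}{\beta}\Bigr), \qquad 0 < \beta \le 1,$$

$$Q[T,r_2(\beta,\tau)] = \begin{cases} \dfrac12\Biggl[1 + \dfrac{\cos\bigl(\frac{T}{2\beta}\bigr)}{\sqrt{1 - \beta^2 \sin^2\bigl(\frac{T}{2\beta}\bigr)}}\Biggr], & 0 \le T \le 2\pi\beta, \\[3ex] 0, & 2\pi\beta \le T < \infty. \end{cases}$$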
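The explicit example lends itself to a numerical check of (21) and (22). The Python sketch below assumes the parametrization $r_2(\beta,\tau) = 1 - \beta^2 + \beta^2\cos(\tau/\beta)$, the form $Q = \frac12[1 + \cos(T/2\beta)/\sqrt{1 - \beta^2\sin^2(T/2\beta)}]$ for $T \le 2\pi\beta$, and the corresponding $P = \frac12 - T/4\pi - (1/2\pi)\arcsin[\beta\sin(T/2\beta)]$ obtained by integrating (21). These formulas are reconstructions used for illustration, and the function names are hypothetical.

```python
import math

def P_r2(T, beta):
    """Assumed closed form of P[T, r_2(beta, .)]; constant beyond 2*pi*beta."""
    T = min(T, 2.0 * math.pi * beta)
    x = T / (2.0 * beta)
    return 0.5 - T / (4.0 * math.pi) - math.asin(beta * math.sin(x)) / (2.0 * math.pi)

def Q_r2(T, beta):
    """Assumed closed form of Q[T, r_2(beta, .)]; beta < 1 keeps the root positive."""
    if T >= 2.0 * math.pi * beta:
        return 0.0
    x = T / (2.0 * beta)
    return 0.5 * (1.0 + math.cos(x) / math.sqrt(1.0 - (beta * math.sin(x)) ** 2))

def Q_from_P(T, beta, d=1e-5):
    """Relation (21): Q = -2*pi * dP/dT, by a central difference."""
    return -2.0 * math.pi * (P_r2(T + d, beta) - P_r2(T - d, beta)) / (2.0 * d)

if __name__ == "__main__":
    beta = 0.7
    for T in (0.5, 1.3, 3.0):
        print(T, Q_r2(T, beta), Q_from_P(T, beta))
```

Under these assumptions $P$ levels off at $\frac12(1 - \beta)$ for $T \ge 2\pi\beta$, the probability that the sinusoidal component never carries the process below zero.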


1.8 A Comparison Theorem for $Q[T,r(\tau)]$


Imposing the condition that $X(t)$ have an upcrossing at $t = 0$ in the
horizontal window sense greatly complicates computation of
probabilities associated with the process. For instance, when $X(t)$ is conditioned
in this manner, the random variables $X(t_1),X(t_2),\ldots,X(t_n)$ are no
longer jointly Gaussian. If $r(\tau)$ is of class 2, their joint density is

$$2\pi \int_0^\infty d\xi\; \xi\, p(\xi,0,x_1,\ldots,x_n),$$

where $p(\xi,x,x_1,\ldots,x_n)$ is the Gaussian density of the unconditioned
variables $X'(0), X(0), X(t_1),\ldots,X(t_n)$.

It is possible, nevertheless, to derive a comparison theorem for
$Q[T,r(\tau)]$ and $Q[T,q(\tau)]$ for class 2 covariances somewhat in the spirit of
Theorem 1. (See Section 2.9 for proof.) The function $g(t) = q^{-1}[r(t)]$
plays a role here. Writing $\tau = g(t)$, then $q(\tau) = r(t)$. For a given value
of $t$, we choose $g(t)$ as the smallest positive value of $\tau$ for which $q(\tau) = r(t)$.
At $t = 0$, we have $\tau = 0$. As $t$ increases from 0, so does $\tau$. One of
two difficulties can occur as $t$ increases: $r(t)$ may reach a local minimum
$r(t_1)$ at $t = t_1$ before $q(\tau)$ has reached its first local minimum, say $q(\tau_1)$;
$\tau$ may assume the value $\tau_1$ when $t$ assumes the value $t_2 \le t_1$. In the
former case we define $g(t)$ only for $0 \le t < t_1$; in the latter case, we
define $g(t)$ only for $0 \le t \le t_2$. The comparison theorem can now be
stated as follows:


Theorem 10 — Let $r(\tau)$ and $q(\tau)$ be of class 2 and let $g(t) = q^{-1}[r(t)]$
be defined as above. If for all nonnegative $x$ and $y$ with $x + y \le T_0$,

$$g(x) + g(y) \ge g(x + y), \tag{24}$$

then for $0 \le T \le T_0$,

$$Q[T,r(\tau)] \le Q[g(T),q(\tau)]. \tag{25}$$

It is easy to show that if $r(\tau) \ge q(\tau)$ in some neighborhood of the origin,
then $g(t)$ has the subadditive property (24) in some sufficiently small
neighborhood of the origin so that the theorem is not vacuous.

The steps which led from Theorem 1 to Theorems 2 and 3 are no
longer valid when $X(t)$ is conditioned to have an upcrossing at $t = 0$.
We have found no analogue of these theorems for $Q[T,r(\tau)]$.

By using (21), one can integrate the inequality (25) to obtain a more
complicated comparison theorem for $P[T,r(\tau)]$, namely

$$P[T,r(\tau)] \ge \frac12 + \int_0^{g(T)} h'(\xi)\,\frac{\partial}{\partial\xi} P[\xi,q(\tau)]\,d\xi = \frac{P[g(T),q(\tau)]}{g'(T)} - \int_0^{g(T)} P[\xi,q(\tau)]\,h''(\xi)\,d\xi,$$

valid for $0 \le T \le T_0$. Here $h(\xi) = g^{-1}(\xi) = r^{-1}[q(\xi)]$.

PART II — PROOFS AND SUPPORTIVE MATERIAL


2.1 The Geometric Approach to $P_n$†


We wish to consider the probability $P_n(r)$ that $n$ jointly normal
variates, each with mean zero and normalized covariance matrix $r$, be
nonnegative. Throughout this section we assume that $r$ is nonsingular.
Then $P_n(r)$ can be written as in (5). Denote the eigenvalues and
normalized eigenvectors of $r$ by $\lambda_i$ and $\psi^i = (\psi_1^i,\psi_2^i,\ldots,\psi_n^i)$, $i = 1,2,\ldots,n$.
One has

$$\begin{aligned} \sum_k r_{jk}\,\psi_k^i &= \lambda_i\,\psi_j^i, \\ \sum_k \psi_k^i\,\psi_k^j &= \sum_k \psi_i^k\,\psi_j^k = \delta_{ij}, \\ r_{ij} &= \sum_k \lambda_k\,\psi_i^k\,\psi_j^k, \end{aligned} \qquad i,j = 1,2,\ldots,n. \tag{26}$$

In (5) make the substitution $x_i = \sum_k \psi_i^k \sqrt{\lambda_k}\,y_k$. There results

$$P_n(r) = (2\pi)^{-n/2} \int_R dy_1 \cdots dy_n\, \exp\Bigl(-\tfrac12 \sum_k y_k^2\Bigr),$$

where the region $R$ is defined by

$$H_i \equiv \sum_k \psi_i^k \sqrt{\lambda_k}\,y_k \ge 0, \qquad i = 1,2,\ldots,n.$$

† The material in this section was developed in 1952. Many of the results have
been obtained independently by other workers and have been reported in the
literature. Cf. Plackett in particular.

Denote by $A_n$ the $(n - 1)$-dimensional content of the intersection of
this region with the surface of the unit sphere having center at the origin.
Then, by changing to a spherical coordinate system,

$$P_n = (2\pi)^{-n/2} A_n \int_0^\infty dr\, r^{n-1} e^{-r^2/2} = \frac{A_n}{S_n},$$

where $S_n = 2\pi^{n/2}/\Gamma(n/2)$ is the area of the unit sphere. Thus, $P_n$ is the
fraction of the unit sphere on the positive side of the $n$ hyperplanes
$H_i = 0$. The unit normal $a^i$ to $H_i$ directed into $R$ has components
$a_k^i = \psi_i^k \sqrt{\lambda_k}$. From the last of (26), we find for the angle $\theta_{ij}$ between $a^i$ and
$a^j$, $\cos\theta_{ij} = a^i \cdot a^j = r_{ij}$.

As mentioned in Section 1.2, expressions for the content $A_n$ of the
spherical simplex in terms of the angles between its bounding surfaces
are not known for $n > 3$. However, for the determination of $P[T,r(\tau)]$
one is concerned with the limit as $n \to \infty$ of $P_n$ where the angles $\theta_{ij}^{(n)}$
are given, for example, by $\cos\theta_{ij} = r[(i - j)T/n]$ with $r(\tau)$ a given
positive definite function. Thus, sufficiently tight bounds for $P_n$ might
in the limit yield useful results concerning $P[T,r(\tau)]$. The geometric
picture suggests a large number of such bounds. Unfortunately, none
has been found which yields useful limits. Since, however,
approximations for the $n$-variable normal integral $P_n$ are of interest in their own
right, we digress to mention several such bounds which may be useful.
(See Ref. 42 for a bibliography on the multivariate normal integral.)
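Circular cones with vertices at the origin can be inscribed and
circumscribed about the region $R$. The half-angle of the inscribed cone is
found to be given by

$$\sin\theta_i = \frac{1}{\sqrt{\sum_{j,k} r^{-1}_{jk}}} \tag{27}$$

and the half-angle of the circumscribed cone is given by

$$\cos\theta_c = \frac{1}{\sqrt{\sum_{j,k} r_{jk} \sqrt{r^{-1}_{jj}\, r^{-1}_{kk}}}}, \tag{28}$$

where $r^{-1}_{jk}$ denotes the elements of the matrix inverse to $r$.
The fraction of the unit sphere cut out by a circular cone of half-angle
$\theta$ is

$$F_n(\theta) = \frac{\Gamma\bigl(\frac{n}{2}\bigr)}{\sqrt\pi\;\Gamma\bigl(\frac{n-1}{2}\bigr)} \int_0^\theta d\varphi\, \sin^{n-2}\varphi = \frac12\, I_{\sin^2\theta}\Bigl(\frac{n-1}{2},\frac12\Bigr), \tag{29}$$

where $I$ is Pearson's incomplete beta function. One has

$$F_n(\theta_i) \le P_n \le F_n(\theta_c). \tag{30}$$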
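The cone fraction in (29) can be evaluated directly from its integral form. The Python sketch below assumes $F_n(\theta) = \Gamma(n/2)\,[\sqrt\pi\,\Gamma((n-1)/2)]^{-1}\int_0^\theta \sin^{n-2}\varphi\,d\varphi$ and uses Simpson's rule, which is an illustrative choice and not the incomplete beta function tables the text refers to.

```python
import math

def cone_fraction(n, theta, steps=2000):
    """F_n(theta): fraction of the unit sphere in n-space inside a circular
    cone of half-angle theta, by Simpson's rule (steps must be even, n >= 2)."""
    c = math.gamma(n / 2.0) / (math.sqrt(math.pi) * math.gamma((n - 1) / 2.0))
    h = theta / steps

    def g(phi):
        return math.sin(phi) ** (n - 2)

    s = g(0.0) + g(theta)
    for k in range(1, steps):
        s += (4.0 if k % 2 else 2.0) * g(k * h)
    return c * s * h / 3.0

if __name__ == "__main__":
    for n in (2, 3, 5, 10):
        print(n, cone_fraction(n, math.pi / 4.0))
```

For $n = 2$ the fraction is $\theta/\pi$, for $n = 3$ it is $(1 - \cos\theta)/2$, and a half-angle of $\pi/2$ gives a hemisphere in every dimension.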


Bounds for $P_n$ can also be written in terms of inscribed and
circumscribed Euclidean simplexes. The planes $H_i = 0$ intersect $n - 1$ at a
time in lines which pass through the origin and a vertex of the spherical
simplex. Let $b^i$ denote the unit vector from the origin to the vertex not
contained in $H_i = 0$. One finds for the components $b_k^i = \psi_i^k/\sqrt{\lambda_k\, r^{-1}_{ii}}$
and for the content of the Euclidean simplex determined by the origin
and the end points of the $b^i$,

$$G_n = \frac{1}{n!\,\sqrt{|r|}\,\sqrt{\prod_i r^{-1}_{ii}}}. \tag{31}$$

This simplex lies within the region of interest. The hyperplane through
the end points of the vectors $b^i \sec\theta_c$ is tangent to the unit sphere. The
Euclidean simplex determined by the origin and the ends of these
vectors therefore contains the region of interest. Thus,

$$\frac{G_n}{V_n} \le P_n \le \frac{\sec^n\theta_c\; G_n}{V_n}, \tag{32}$$

where $V_n = \pi^{n/2}/\Gamma(n/2 + 1)$ is the content of the unit sphere, $\theta_c$ is
given by (28) and $G_n$ by (31). Incidentally, for the cosines of the angles
between the $b$'s one finds the interesting reciprocal relations

$$s_{ij} = b^i \cdot b^j = \frac{r^{-1}_{ij}}{\sqrt{r^{-1}_{ii}\, r^{-1}_{jj}}}, \qquad r_{ij} = \frac{s^{-1}_{ij}}{\sqrt{s^{-1}_{ii}\, s^{-1}_{jj}}},$$

which is the natural generalization of the usual relationship between the
sides and angles of a spherical triangle in three-space.

One can expect the bounds in (30) to be close to each other when the
$b^i$ are nearly coplanar, e.g., when all the entries of $r$ are near unity. One
can expect the bounds in (32) to be close to each other when the $b^i$ are
nearly codirectional, e.g., when all the entries of $r^{-1}$ are nearly equal.

An important differential recursion relation first derived by Schläfli
for the content of the spherical simplex can be obtained in an analytic
manner from the expression (5) for $P_n$. We write

$$P_n(r) = \int_0^\infty dx_1 \cdots \int_0^\infty dx_n\, g_n(x_1,\ldots,x_n;r) \tag{33}$$


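where the $n$-variate Gaussian density is given in terms of its
characteristic function by

$$g_n(x_1,\ldots,x_n;r) = \frac{1}{(2\pi)^n} \int_{-\infty}^{\infty} d\xi_1 \cdots \int_{-\infty}^{\infty} d\xi_n\, \exp\Bigl(-i \sum_j \xi_j x_j - \tfrac12 \sum_{j,k} r_{jk}\,\xi_j \xi_k\Bigr).$$

From this latter expression it follows that

$$\frac{\partial g_n}{\partial r_{jk}} = \frac{\partial^2 g_n}{\partial x_j\,\partial x_k}, \qquad k > j. \tag{34}$$

Here we regard $g_n$ as a function of the $n(n - 1)/2$ variables $r_{jk}$, $k > j$,
and recall that $r_{ii} = 1$, $r_{jk} = r_{kj}$. Regarding $P_n$ as a function of this same
set of variables, we find from (33) and (34)

$$\frac{\partial P_n(r)}{\partial r_{12}} = \int_0^\infty dx_1 \cdots \int_0^\infty dx_n\, \frac{\partial^2}{\partial x_1\,\partial x_2}\, g_n(x_1,\ldots,x_n;r).$$

Perform the integrations indicated on $x_1$ and $x_2$. There results

$$\frac{\partial P_n(r)}{\partial r_{12}} = \int_0^\infty dx_3 \cdots \int_0^\infty dx_n\, g_n(0,0,x_3,\ldots,x_n;r) \ge 0. \tag{35}$$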
Now if $g_n$ is the density for the random variables $X_1,\ldots,X_n$,

$$g_n(x_1,\ldots,x_n;r) = p(x_1,x_2)\,p(x_3,\ldots,x_n \mid x_1,x_2),$$

where $p(x_1,x_2)$ is the joint density for $X_1$ and $X_2$ and

$$p(x_3,\ldots,x_n \mid x_1,x_2)$$

is the conditional density of $X_3,\ldots,X_n$ given that $X_1 = x_1$ and $X_2 = x_2$.
In the case of Gaussian variates, these densities are well known.
Evaluating this expression at $x_1 = x_2 = 0$, one finds

$$g_n(0,0,x_3,\ldots,x_n;r) = \frac{1}{2\pi\sqrt{1 - r_{12}^2}}\; g_{n-2}(x_3,\ldots,x_n;r_{\cdot 12}).$$

When combined with (35) and generalized for arbitrary indices, this
yields

$$\frac{\partial P_n(r)}{\partial r_{jk}} = \frac{1}{2\pi\sqrt{1 - r_{jk}^2}}\; P_{n-2}(r_{\cdot jk}) \ge 0. \tag{36}$$

Here $r_{\cdot jk}$ is the customary notation of the statistician for partial
correlation coefficients (see Ref. 40, Section 23.4 and pp. 318-319), so that,
for example with $\mu \ne j,k$, $\nu \ne j,k$

$$r_{\mu\nu\cdot jk} = \frac{\begin{vmatrix} r_{\mu\nu} & r_{\mu j} & r_{\mu k} \\ r_{j\nu} & 1 & r_{jk} \\ r_{k\nu} & r_{kj} & 1 \end{vmatrix}}{\sqrt{\begin{vmatrix} 1 & r_{\mu j} & r_{\mu k} \\ r_{j\mu} & 1 & r_{jk} \\ r_{k\mu} & r_{kj} & 1 \end{vmatrix}\;\begin{vmatrix} 1 & r_{\nu j} & r_{\nu k} \\ r_{j\nu} & 1 & r_{jk} \\ r_{k\nu} & r_{kj} & 1 \end{vmatrix}}}.$$

Equation (36) is Schläfli's celebrated differential recursion formula.
His many relations connecting the angles of the boundary simplexes are
familiar to the statistician as identities among partial correlation
coefficients.
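For $n = 2$ and $n = 3$ the recursion can be checked against the closed forms (6) and (7), since the lower-order quantities are then simply $P_0 = 1$ and $P_1 = \frac12$. The Python sketch below assumes the recursion reads $\partial P_n/\partial r_{jk} = P_{n-2}(r_{\cdot jk})/(2\pi\sqrt{1 - r_{jk}^2})$ and compares it with a central difference. The helper names are hypothetical.

```python
import math

def orthant_p2(r12):
    """Closed form (6)."""
    return 0.25 + math.asin(r12) / (2.0 * math.pi)

def orthant_p3(r12, r13, r23):
    """Closed form (7)."""
    return 0.125 + (math.asin(r12) + math.asin(r13) + math.asin(r23)) / (4.0 * math.pi)

def schlafli_rhs(r_jk, p_lower):
    """Assumed right side of (36) with P_{n-2} supplied as p_lower."""
    return p_lower / (2.0 * math.pi * math.sqrt(1.0 - r_jk * r_jk))

def central_diff(f, x, d=1e-6):
    """Central-difference estimate of f'(x)."""
    return (f(x + d) - f(x - d)) / (2.0 * d)

if __name__ == "__main__":
    r = 0.3
    print(central_diff(orthant_p2, r), schlafli_rhs(r, 1.0))
    print(central_diff(lambda x: orthant_p3(x, 0.2, -0.1), r), schlafli_rhs(r, 0.5))
```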

We close this section with a simple demonstration that for odd $n$, $P_n$
can be expressed in terms of the content of lower dimensional simplexes.
Let $p_i$ denote the probability that $X_i$ be nonnegative, $p_{ij}$ denote the
probability that $X_i$ and $X_j$ be nonnegative, etc. Then $P_n = p_{12\cdots n}$. Set
$M_1 = \sum p_i$, $M_2 = \sum_{i<j} p_{ij}$, etc. Then from the well-known inclusion
and exclusion formula, the probability $Q_n$ that none of the variates be
nonnegative is

$$Q_n = 1 - M_1 + M_2 - \cdots + (-1)^n M_n.$$

But from symmetry, $P_n = Q_n = M_n$, so that

$$[1 - (-1)^n]\,P_n = 1 - M_1 + M_2 - \cdots + (-1)^{n-1} M_{n-1}.$$

(Cf. Sommerville, Chapter IX, Section 1.9.) No recursion is known for
even $n$.
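For $n = 3$ the odd-$n$ relation reduces to $2P_3 = 1 - M_1 + M_2$ with $M_1 = \frac32$ and $M_2$ a sum of three bivariate orthant probabilities. The Python sketch below, with illustrative function names, confirms that this reproduces the closed form $\frac18 + (1/4\pi)\sum \arcsin r_{ij}$.

```python
import math

def orthant_p2(r12):
    """Pr{X1 >= 0, X2 >= 0} = 1/4 + asin(r12)/(2*pi)."""
    return 0.25 + math.asin(r12) / (2.0 * math.pi)

def orthant_p3_closed(r12, r13, r23):
    """Closed form 1/8 + (asin r12 + asin r13 + asin r23)/(4*pi)."""
    return 0.125 + (math.asin(r12) + math.asin(r13) + math.asin(r23)) / (4.0 * math.pi)

def orthant_p3_recursive(r12, r13, r23):
    """Odd-n relation for n = 3: 2*P_3 = 1 - M_1 + M_2."""
    m1 = 3 * 0.5  # each single variate is nonnegative with probability 1/2
    m2 = orthant_p2(r12) + orthant_p2(r13) + orthant_p2(r23)
    return 0.5 * (1.0 - m1 + m2)

if __name__ == "__main__":
    print(orthant_p3_closed(0.5, 0.2, -0.3), orthant_p3_recursive(0.5, 0.2, -0.3))
```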
2.2 Proof of Lemma 1


Lemma 1 follows directly from (35). Note that in the derivation of
this result, it was not necessary to normalize the covariance matrix.
This result thus states that if $\rho$ is a positive definite symmetric matrix,
then

$$\frac{\partial P_n(\rho)}{\partial \rho_{ij}} \ge 0, \qquad j > i, \tag{37}$$

with $P_n(\rho)$ defined by (5).

Now let $r$ and $q$ be nonnegative definite $n \times n$ symmetric matrices with
$r_{ii} = q_{ii} = 1$. Then $\rho = \lambda r + (1 - \lambda)q + \epsilon I$, where $I$ is the $n \times n$ unit
matrix, is positive definite for each $\epsilon > 0$ and each $\lambda$ satisfying
$0 \le \lambda \le 1$. Consider $P_n(\rho)$ as a function of $\lambda$. It is readily established that
$P_n(\rho)$ possesses a continuous derivative and indeed that

$$\frac{dP_n}{d\lambda} = \sum_{j>i} \frac{\partial P_n(\rho)}{\partial \rho_{ij}}\,\frac{d\rho_{ij}}{d\lambda} = \sum_{j>i} \frac{\partial P_n(\rho)}{\partial \rho_{ij}}\,(r_{ij} - q_{ij}).$$

If now $r_{ij} \ge q_{ij}$, $j > i$, (37) then gives

$$\frac{dP_n}{d\lambda} \ge 0.$$

Integration on $\lambda$ from 0 to 1 yields $P_n(r + \epsilon I) \ge P_n(q + \epsilon I)$. From
well-known continuity theorems (see Cramér, Section 24.3 and 10.7),
Lemma 1 follows by letting $\epsilon$ tend to zero.


2.3 Proof of Theorem 2 


Let $r(\tau)$ and $q(\tau)$ both be of class $\alpha > 0$ and suppose that
$r(\tau) \ge q(\tau)$ for $0 \le \tau \le T_0$. Then for any $\lambda > 1$,

$$r(\tau) \ge q(\tau) \ge r(\lambda\tau), \qquad 0 < \tau \le \tau_1(\lambda), \tag{38}$$

for some suitable $\tau_1(\lambda)$. By Theorem 1, then, and the scaling law (2)

$$P[T,r(\tau)] \ge P[T,q(\tau)] \ge P[\lambda T,r(\tau)], \qquad 0 \le T \le \tau_1(\lambda). \tag{39}$$

To see how best to choose $\lambda$ to obtain a good lower bound for $P[T,q(\tau)]$,
it is convenient to define a version of $h(\tau) = r^{-1}[q(\tau)]$. Let $\tau_q$ be the
smallest value of $\tau > 0$ for which $q(\tau)$ is not decreasing. (Strictly
speaking, $\tau_q$ = inf. of those $T$ for which $q(\tau)$ is not strictly monotone for
$0 < \tau \le T$. If this $T$ set is empty, define $\tau_q = \infty$.) Define $\tau_r$ in an
analogous manner. The function $r^{-1}(q)$ is defined for $1 \ge q \ge r(\tau_r)$ by the
branch having values between 0 and $\tau_r$. Similarly we define $q^{-1}(r)$ for
$1 \ge r \ge q(\tau_q)$ by the branch having values between 0 and $\tau_q$. If $q(\tau_q) \le r(\tau_r)$,
we define $h(\tau) = r^{-1}[q(\tau)]$ only for $0 \le \tau < q^{-1}[r(\tau_r)]$. If $q(\tau_q) \ge r(\tau_r)$,
we define $h(\tau)$ for $0 \le \tau < \tau_q$. Clearly $h(0) = 0$. As $\tau$ increases
from zero, $h(\tau)$ is at first at least as large as $\tau$, since $r(\tau) \ge q(\tau)$ near
$\tau = 0$. For small $\tau$, $r(h) = q(\tau)$, so that $h'(\tau)\,r'(h) = q'(\tau)$ or

$$h'(0+) = \lim_{\tau \to 0+} \frac{q'(\tau)}{r'(h)} = \lim_{\tau \to 0+} \frac{\tau^{\alpha-1}}{h^{\alpha-1}} = \lim_{\tau \to 0+} \Bigl(\frac{h}{\tau}\Bigr)^{1-\alpha} = h'(0+)^{1-\alpha},$$

so that $h'(0+) = 1$. Three typical curves for $y = h(\tau)$ are shown in
Fig. 1. Note that $h(\tau)$ is strictly monotone in its domain of definition.

Consider now the plots of $y = h(\tau)$ and $y = \lambda\tau$ as shown on Fig. 1.
For all values of $\lambda$, these curves have the origin as a point in common.
When $\lambda = 1$, the straight line $y = \lambda\tau$ is tangent to $y = h(\tau)$ at the
origin. As $\lambda$ is increased from 1, a second point of intersection moves
out from the origin. It may happen, as in Fig. 1(a), that the line
$y = \lambda\tau$ becomes tangent to $y = h(\tau)$. If so, we denote by $T^*$ the abscissa
of the first such point of tangency as $\lambda$ increases from unity and we
denote the corresponding value of $\lambda$ by $\lambda^*$. If no such tangency occurs, we
denote by $T^*$ the largest value of $\tau$ in the domain of $h(\tau)$. In this case we
set $\lambda^* = h(T^*)/T^*$. (Note that $\lambda^*$ may be infinite.) Observe that for a
given $\lambda < \lambda^*$, the abscissa of the first point of intersection of $y = \lambda\tau$
with $y = h(\tau)$ to the right of the origin, say $\tau_1$, satisfies $h(\tau_1) = \lambda\tau_1$
or $q(\tau_1) = r(\lambda\tau_1)$. For $\tau \le \tau_1$, the right inequality of (38) maintains;
for $\tau = \tau_1 + \epsilon$, $r(\lambda\tau) > q(\tau)$ for small positive $\epsilon$.

The lower bound $P[\lambda T,r(\tau)]$ on the right of (39) is a nonincreasing
function of $\lambda$ for a fixed $T$. For a given $T < T^*$, then, this bound is made
as large as possible by choosing $\lambda$ as the smallest value greater than unity
for which $q(T) = r(\lambda T)$. With this choice, $\lambda T$ has the value $h(T)$ and
Theorem 2 is proved. The largest $T^*$ for which the theorem as stated in
Section 1.3 is true is the value $T^*$ defined in the previous paragraph.

Note that if $r(\tau)$ and $q(\tau)$ cross at $\tau_0 > 0$, i.e., $r(\tau_0) = q(\tau_0)$, $T^*$ is
necessarily less than $\tau_0$, for in this case, $y = h(\tau)$ crosses $y = \tau$ at $\tau_0$
as in Fig. 1(a) and a tangency occurs as indicated.
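A concrete pair of covariances makes the construction of $h(\tau) = r^{-1}[q(\tau)]$ tangible. The Python sketch below takes $r(\tau) = e^{-\tau^2/2}$ and $q(\tau) = \cos\tau$, both hypothetical choices of class 2 with $r \ge q$ near the origin, for which $h(\tau) = \sqrt{-2\log\cos\tau}$ on $0 \le \tau < \pi/2$. It is an illustration of the definitions only.

```python
import math

def r(tau):
    # Hypothetical class 2 covariance with r >= q near the origin.
    return math.exp(-tau * tau / 2.0)

def q(tau):
    # Hypothetical class 2 covariance of a pure sinusoid.
    return math.cos(tau)

def h(tau):
    """h(tau) = r^{-1}[q(tau)] on 0 <= tau < pi/2, principal branch."""
    return math.sqrt(-2.0 * math.log(math.cos(tau)))

def best_lambda(T):
    """Smallest lambda > 1 with q(T) = r(lambda*T), i.e. lambda = h(T)/T."""
    return h(T) / T

if __name__ == "__main__":
    for T in (0.1, 0.5, 1.0, 1.4):
        print(T, h(T), best_lambda(T))
```

With this pair $h$ lies above the line $y = \tau$ throughout its domain and $h(T)/T$ tends to 1 as $T \to 0$, in line with $h'(0+) = 1$.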


2.4 Proof of Theorem 3 


Let $T_1 > 0$ and $T_2 > 0$ be given and set $T_3 = T_1 + T_2$. Consider
the approximation to $P[T_3, r(\tau)]$ given by the probability $P_n(r)$ that
$X(t_1), \ldots, X(t_{n_1}), X(\tau_1), \ldots, X(\tau_{n_2})$
all be nonnegative. Here $0 = t_1 < t_2 < \cdots < t_{n_1} = T_1$ is a partition
of $(0, T_1)$ and $T_1 < \tau_1 < \tau_2 < \cdots < \tau_{n_2} = T_3$ is a partition of
$(T_1, T_1 + T_2)$ and $n_1 + n_2 = n$. The covariance matrix $r$ can be written in
block form

$$r = \begin{pmatrix} A & B \\ B^{\top} & C \end{pmatrix},$$

where $A$ is an $n_1 \times n_1$ normalized covariance matrix with elements
$r(t_i - t_j)$, $C$ is an $n_2 \times n_2$ normalized covariance matrix with elements
$r(\tau_i - \tau_j)$, and $B$ has $n_1$ rows and $n_2$ columns and elements $r(t_i - \tau_j)$.

Fig. 1. The curve $y = h(\tau)$.


Now

$$\hat r = \begin{pmatrix} A & 0 \\ 0 & C \end{pmatrix}$$

is also a covariance matrix, and if $r(\tau) \ge 0$ for $0 \le \tau \le T_3$, the
elements of $r$ are not less than the corresponding elements of $\hat r$. From Lemma
1, it follows that $P_n(r) \ge P_n(\hat r)$. But $\hat r$ is the covariance matrix for two
independent sets of random variables so that

$$P_n(r) \ge P_n(\hat r) = P_{n_1}(A)\,P_{n_2}(C).$$

By refining the partition with mesh tending to zero, one has
$P[T_3, r(\tau)] \ge P[T_1, r(\tau)]\,P[T_2, r(\tau)]$ and the theorem is established. (It is
trivially true if $T_1$ or $T_2$ or both are zero.)


2.5 Proof of Theorem 5 


Theorem 5 is a consequence of the following more general 

Theorem 11: Let the random variables $X_1, X_2, \ldots, X_n$, $n > 2$, have
a joint density $p(x_1, \ldots, x_n)$ with the property $p(-x_1, \ldots, -x_n) =
p(x_1, \ldots, x_n)$. Then

$$\Pr\{X_i \ge 0,\ i = 1, 2, \ldots, n\} \le -\frac{n-2}{2n} + \frac{4}{n^2} \sum_{i<j} \Pr\{X_i \ge 0,\ X_j \ge 0\}.$$

The proof of this theorem follows that of a theorem by Gaddum
concerning spherical simplexes and their angle sums. We introduce the
following notations: $P_{ij} = \Pr\{X_i \ge 0, X_j \ge 0\}$, $P = \Pr\{X_i \ge 0,\ i =
1, 2, \ldots, n\}$, $R(a_1, a_2, \ldots, a_n) = \Pr\{a_1 X_1 \ge 0, a_2 X_2 \ge 0, \ldots, a_n X_n \ge 0\}$,
$a_i = \pm 1$, $i = 1, \ldots, n$. Thus $P = R(1, 1, \ldots, 1)$ and

$$\sum_{a_1, \ldots, a_n} R(a_1, a_2, \ldots, a_n) = 1,$$

where in the sum each $a$ takes values $+1$ and $-1$. The $2^n$ symbols $R$
are equal in pairs;

$$R(a_1, a_2, \ldots, a_n) = R(-a_1, -a_2, \ldots, -a_n).$$

We call $R(-a_1, -a_2, \ldots, -a_n)$ the complement of $R(a_1, a_2, \ldots, a_n)$.
One has

$$\begin{aligned}
P_{12} &= P + {\sum}' R(1, 1, a_3, a_4, \ldots, a_n) \\
P_{13} &= P + {\sum}' R(1, a_2, 1, a_4, \ldots, a_n) \\
&\ \ \vdots
\end{aligned} \tag{40}$$

Here the $R$ symbol on the right of the equation having $P_{ij}$ as left member
has a 1 in the $i$th and $j$th places and $a$'s elsewhere. In each equation,
the sum is over all combinations of plus and minus 1 for the $a$'s except
for the combination all $a$'s plus 1.
Now consider adding the $n(n-1)/2$ equations (40). One has

$$\sum_{i<j} P_{ij} = [n(n-1)/2]\,P + S,$$

where $S$ is the sum of all the sums of $R$ symbols on the right of (40). A
given $R$ symbol with precisely $j$ of its arguments $+1$ will occur $j(j-1)/2$
times in $S$, $j = 2, 3, \ldots, n-1$. Denote by $T_j$ the sum of all $R$ symbols
that have precisely $j$ of their arguments $+1$. Then

$$\sum_{i<j} P_{ij} = \frac{n(n-1)}{2}\,P + \sum_{j=2}^{n-1} \frac{j(j-1)}{2}\,T_j. \tag{41}$$

Now

$$\sum_{j=2}^{n-1} \frac{j(j-1)}{2}\,T_j = \sum_{j=1}^{n-2} \frac{(n-j)(n-j-1)}{2}\,T_{n-j},$$

so that

$$\sum_{j=2}^{n-1} \frac{j(j-1)}{2}\,T_j = \frac12 \sum_{j=2}^{n-1} \frac{j(j-1)}{2}\,T_j + \frac12 \sum_{j=1}^{n-2} \frac{(n-j)(n-j-1)}{2}\,T_{n-j}.$$

But since an $R$ symbol and its complement are numerically equal, $T_j =
T_{n-j}$, so that (41) becomes

$$\sum_{i<j} P_{ij} = \frac{n(n-1)}{2}\,P + \frac12 \sum_{j=1}^{n-1} \left[\frac{j(j-1)}{2} + \frac{(n-j)(n-j-1)}{2}\right] T_j.$$

Now, for $j = 2, 3, \ldots, n-2$,

$$\frac{j(j-1)}{2} + \frac{(n-j)(n-j-1)}{2} \ge \frac{n(n-2)}{4}$$

(the same bound holds for $j = 1$ and $j = n-1$, where the left member equals
$(n-1)(n-2)/2$), so that

$$\sum_{i<j} P_{ij} \ge \frac{n(n-1)}{2}\,P + \frac{n(n-2)}{8} \sum_{j=1}^{n-1} T_j.$$

However, the last appearing sum is $1 - 2P$ and Theorem 11 follows
directly.
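
As a sanity check, the inequality of Theorem 11 can be evaluated exactly for $n = 3$ zero-mean Gaussian variables, where both sides have closed forms in terms of arcsines of the correlation coefficients. The sketch below assumes the reading $P \le -(n-2)/(2n) + (4/n^2)\sum_{i<j} P_{ij}$ of the garbled display; the function names and the sample correlations are illustrative choices made here, not from the paper.

```python
import math

def orthant_prob_3(rho12, rho13, rho23):
    """Exact Pr{X1 >= 0, X2 >= 0, X3 >= 0} for a zero-mean trivariate Gaussian."""
    s = math.asin(rho12) + math.asin(rho13) + math.asin(rho23)
    return 0.125 + s / (4.0 * math.pi)

def pair_prob(rho):
    """Exact Pr{Xi >= 0, Xj >= 0} for a zero-mean bivariate Gaussian with correlation rho."""
    return 0.25 + math.asin(rho) / (2.0 * math.pi)

def theorem11_bound(pair_probs, n):
    """Right member of the inequality as read here: -(n-2)/(2n) + (4/n^2) * sum of P_ij."""
    return -(n - 2) / (2.0 * n) + 4.0 / n ** 2 * sum(pair_probs)

# Illustrative correlations (hypothetical values forming a valid correlation matrix).
rhos = (0.5, 0.2, 0.3)
exact = orthant_prob_3(*rhos)
bound = theorem11_bound([pair_prob(rho) for rho in rhos], 3)
```

For $n = 3$ the gap between bound and exact value works out to $1/24 - (\arcsin\rho_{12} + \arcsin\rho_{13} + \arcsin\rho_{23})/(36\pi)$, which is nonnegative and vanishes only when all three correlations equal unity.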

In the case of a Gaussian process $X(t)$ with normalized covariance
function $r(\tau)$, we consider the application of Theorem 11 to the random
variables $X_i = X(iT/n)$, $i = 1, 2, \ldots, n$. Then from (6), $P_{ij} = \tfrac14 +
(1/2\pi) \arcsin r[(i - j)T/n]$. By taking limits as $n$ becomes infinite,
Theorem 11 then yields

$$P[T, r(\tau)] \le \frac{2}{\pi T^2} \int_0^T dy \int_0^y dx\, \arcsin r(y - x).$$

Elementary manipulations then lead to the result stated as Theorem 5.
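
The double integral in this bound reduces to a single integral, since $\int_0^T dy \int_0^y dx\, f(y - x) = \int_0^T (T - u) f(u)\, du$. A minimal numerical sketch follows, assuming the bound is read as $(2/\pi T^2) \int_0^T (T - u) \arcsin r(u)\, du$; the function name, the midpoint rule, and the test covariance $r(\tau) = \cos\tau$ are choices made here for illustration.

```python
import math

def theorem5_bound(r, T, steps=20000):
    """Evaluate (2 / (pi T^2)) * integral_0^T (T - u) * arcsin(r(u)) du by the midpoint rule."""
    du = T / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * du
        # Clamp guards against rounding that pushes r(u) slightly outside [-1, 1].
        total += (T - u) * math.asin(max(-1.0, min(1.0, r(u))))
    return 2.0 * total * du / (math.pi * T * T)

# Illustrative covariance (an assumption of this sketch): r(tau) = cos(tau).
bound_at_pi = theorem5_bound(math.cos, math.pi)
bound_at_half_pi = theorem5_bound(math.cos, math.pi / 2.0)
```

For $r(\tau) = \cos\tau$ and $0 \le T \le \pi$ one has $\arcsin\cos u = \pi/2 - u$, so the integral can be done by hand and the bound equals $1/2 - T/(3\pi)$; this gives a closed-form check on the quadrature.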


2.6 Proof of Theorem 6 


Consider $n$ random variables, $X_1, X_2, \ldots, X_n$, and the following
mutually exclusive events: (A) the variables are all nonnegative; ($B_j$)
the first $j$ variables are nonnegative and the $(j+1)$st is negative, $j =
1, 2, 3, \ldots, n-1$. The union $C$ of these events is the event $X_1 \ge 0$. We
suppose $\Pr\{C\} = \tfrac12$ and write $P_n = \Pr\{A\}$, $V_j = \Pr\{B_j\}$, $j = 1, 2, \ldots, n-1$
so that

$$P_n = \frac12 - \sum_{j=1}^{n-1} V_j.$$

But $V_j \le \Pr\{X_1 \ge 0, X_j \ge 0, X_{j+1} < 0\}$, $j = 2, \ldots, n-1$ so that

$$P_n \ge \frac12 - \Pr\{X_1 \ge 0, X_2 \le 0\} - \sum_{j=2}^{n-1} \Pr\{X_1 \ge 0, X_j \ge 0, X_{j+1} < 0\}. \tag{42}$$

Consider a stationary Gaussian process $X(t)$ with a class 2 covariance
$r(\tau)$. In (42) set $X_j = X(jT/n)$. From (7), one obtains

$$\Pr\{X_1 \ge 0, X_j \ge 0, X_{j+1} < 0\} = \frac18 + \frac{1}{4\pi} \left[ \arcsin r\!\left(\frac{(j-1)T}{n}\right) - \arcsin r\!\left(\frac{jT}{n}\right) - \arcsin r\!\left(\frac{T}{n}\right) \right]$$

and from (6)

$$\Pr\{X_1 \ge 0, X_2 \le 0\} = \frac14 - \frac{1}{2\pi} \arcsin r\!\left(\frac{T}{n}\right).$$

Insert these values in (42) and pass to the limit as $n$ becomes infinite.
Theorem 6 results.
2.7 On Class 2 Covariances


Let $r(\tau)$ be a class 2 covariance. From the Bochner representation

$$r(\tau) = \int \cos \lambda\tau \, dF(\lambda),$$

where we now have

$$1 = \int dF(\lambda) = \int \lambda^2 \, dF(\lambda),$$

it is not hard to show that $r$ is continuous, that $r'(\tau)$ exists everywhere
and is continuous, and that $r''(\tau)$ exists and is continuous everywhere
except perhaps at $\tau = 0$.

If the process $X(t)$ with mean zero has $r(\tau)$ as its covariance function,
then the four random variables $X(0), X'(0), X(t), X'(t)$ have
covariance matrix

$$\begin{pmatrix}
1 & 0 & r & r' \\
0 & 1 & -r' & -r'' \\
r & -r' & 1 & 0 \\
r' & -r'' & 0 & 1
\end{pmatrix},$$

where we write $r = r(t)$, $r' = d/dt\; r(t)$, $r'' = d^2/dt^2\; r(t)$. For this to be
a nonnegative definite matrix it is necessary that the determinants of
all major diagonal submatrices be nonnegative. Evaluating these determinants,
one finds the system of differential inequalities

$$(1 - r^2 - r'^2)(1 - r'^2 - r''^2) - (rr' + r'r'')^2 \ge 0, \tag{43}$$

$$1 - r^2 - r'^2 \ge 0, \tag{44}$$

$$1 - r'^2 - r''^2 \ge 0,$$

$$1 - r^2 \ge 0, \qquad 1 - r'^2 \ge 0, \qquad 1 - r''^2 \ge 0.$$

These inequalities can also be derived without raising the question of
existence of the derivative process by demanding that the covariance
matrix of the four random variables $X(0)$, $X(\epsilon) - X(0)$, $X(t)$, $X(t+\epsilon)
- X(t)$ be nonnegative definite for arbitrarily small values of $\epsilon$.
Consider now the family of covariances

$$r_2(\beta, \tau) = 1 - \beta^2 + \beta^2 \cos\left(\frac{\tau}{\beta}\right), \qquad 0 \le \beta \le 1, \tag{45}$$

introduced in Section 1.1. In what follows, we shall be concerned with the
family, $F$, of curves $r = r_2(\beta, \tau)$, where for each $\beta$ with $0 < \beta \le 1$ we
restrict our attention to the interval $0 \le \tau \le \pi\beta$. Several members of
the family are shown in Fig. 2. The following statements, evident from
the figure, are easy to prove analytically. (1) The curves of the family
do not intersect each other except at $\tau = 0$. (2) A horizontal line $r =
r_0$ with $|r_0| < 1$ intersects exactly once each member of $F$ with parameter
value in the range $1 \ge \beta \ge \sqrt{(1 - r_0)/2}$. For each value of $a$
satisfying $-\sqrt{1 - r_0^2} \le a \le 0$, there is a unique member of the family
that intersects the line $r = r_0$ with slope $a$. If $\beta(a)$ denotes the parameter
value of this member of $F$, $\beta(a)$ is a continuous strictly monotone
decreasing function of $a$, $-\sqrt{1 - r_0^2} \le a \le 0$.

We shall say that the curve $r = r(\tau)$ intersects the curve $r = q(\tau)$
from below if at the point of intersection $r' > q'$.

Lemma 2: Let $r(\tau)$ be of class 2.

a. If the first local minimum of $r(\tau)$ is at $\tau_1$, then $r = r(\tau)$ cannot
intersect from below any member of the family $F$,

$$r = r_2(\beta, \tau) = 1 - \beta^2 + \beta^2 \cos\left(\frac{\tau}{\beta}\right), \qquad 0 \le \tau \le \pi\beta, \quad 0 \le \beta \le 1,$$

in the interval $0 \le \tau \le \tau_1$.

b. If $r = r(\tau)$ passes down through the point $(\tau_0, r_0)$ with slope $r_0'$
satisfying $-\sqrt{1 - r_0^2} \le r_0' \le 0$, then there is a unique translated member
of $F$, say $r = r_2(\beta_0, \tau - \mu)$, which passes through $(\tau_0, r_0)$ with slope $r_0'$.
If $r_2(\beta_0, \tau - \mu)$ and $r(\tau)$ are nonincreasing for $\tau_1 \le \tau \le \tau_0$, then $r(\tau)
\le r_2(\beta_0, \tau - \mu)$ for $\tau_1 \le \tau \le \tau_0$.





Fig. 2. The family F.
Proof: Part a of the lemma will be deduced from part b. The first
conclusion of part b is the remark (2) above. The second conclusion of
part b follows from the inequality (43). If $|r| \ne 1$, this latter can be
written by elementary algebraic manipulations as

$$-1 + \frac{r'^2}{1 + r} \le r'' \le 1 - \frac{r'^2}{1 - r}.$$

The right-hand inequality can be rewritten as

$$\frac{r''}{(1-r)^2} + \frac{r'^2}{(1-r)^3} \le \frac{1}{(1-r)^2}$$

or, if $r' \le 0$, as

$$\frac{2r'r''}{(1-r)^2} + \frac{2r'^3}{(1-r)^3} \ge \frac{2r'}{(1-r)^2}$$

or

$$\frac{d}{d\tau} \frac{r'^2}{(1-r)^2} \ge 2\, \frac{d}{d\tau} \frac{1}{1-r}.$$

Integrate this expression from $\tau$ to $\tau_0$ with $\tau < \tau_0$ to obtain

$$\frac{r'^2}{(1-r)^2} - \frac{2}{1-r} \le \frac{r_0'^2}{(1-r_0)^2} - \frac{2}{1-r_0}, \tag{46}$$

where the subscript 0 refers to quantities evaluated at $\tau_0$. Denote the
right member of this inequality by $-1/h^2$, and note that, as is indicated
by the notation,

$$\frac{1}{h^2} = \frac{2(1 - r_0) - r_0'^2}{(1 - r_0)^2} = \frac{(1 - r_0)^2 + 1 - r_0^2 - r_0'^2}{(1 - r_0)^2} > 0$$

by (44). Inequality (46) now becomes

$$r'^2 \le \frac{1}{h^2}\,(1 - r)(r - 1 + 2h^2)$$

or what is the same

$$r'^2 \le \frac{1}{h^2}\,(1 - r)(r - \lambda),$$

where

$$\lambda = 1 - 2h^2. \tag{47}$$

It follows then that

$$\frac{-r'}{\sqrt{(1 - r)(r - \lambda)}} \le \frac{1}{h},$$
with $h$ a nonnegative quantity. Integrate this again from $\tau$ to $\tau_0$ to obtain

$$\arcsin \frac{r - (1 + \lambda)/2}{(1 - \lambda)/2} - \arcsin \frac{r_0 - (1 + \lambda)/2}{(1 - \lambda)/2} \le \frac{\tau_0 - \tau}{h}.$$

Thus one finds

$$r(\tau) \le \frac{1 + \lambda}{2} + \frac{1 - \lambda}{2} \sin\left[ \frac{\tau_0 - \tau}{h} + \arcsin \frac{r_0 - (1 + \lambda)/2}{(1 - \lambda)/2} \right] = q(\tau). \tag{48}$$

This inequality is valid in a $\tau$-range to the left of $\tau_0$ until either $q(\tau)$ or
$r(\tau)$ has a local maximum.

Now by (47), $q(\tau)$ can be written

$$q(\tau) = 1 - \beta^2 + \beta^2 \cos\left(\frac{\tau - \mu}{\beta}\right),$$

for suitably defined $\mu$, and one finds by using the various definitions

$$q(\tau_0) = r_0, \qquad q'(\tau_0) = r_0'.$$

Thus $q(\tau)$ is the member of the family $F$ which, when translated in the
$\tau$-direction, passes through the point $(\tau_0, r_0)$ with slope $r_0'$. To the left
of $\tau_0$, the curve $r = r(\tau)$ remains below this translated member of $F$.
Part b is thus proved.
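
The translated member of $F$ through a given point with a given slope can be computed explicitly. The sketch below is illustrative only: it assumes the relations $1/h^2 = [2(1 - r_0) - r_0'^2]/(1 - r_0)^2$ and $\beta = h$ (which follows from $\lambda = 1 - 2h^2$ and the form of the family), and the function names are hypothetical.

```python
import math

def r2(beta, tau):
    """Member of the family F: 1 - beta^2 + beta^2 * cos(tau / beta)."""
    return 1.0 - beta ** 2 + beta ** 2 * math.cos(tau / beta)

def translated_member(tau0, r0, slope0):
    """Return (beta, mu) such that r2(beta, tau - mu) passes through (tau0, r0) with slope slope0.

    Assumes |r0| < 1 and -sqrt(1 - r0^2) <= slope0 <= 0.
    """
    h2 = (1.0 - r0) ** 2 / (2.0 * (1.0 - r0) - slope0 ** 2)  # h^2 as defined via (46)
    beta = math.sqrt(h2)                                       # parameter value equals h
    # Phase theta = (tau0 - mu) / beta in [0, pi], from r0 = 1 - beta^2 * (1 - cos(theta)).
    cos_theta = 1.0 - (1.0 - r0) / h2
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))
    mu = tau0 - beta * theta
    return beta, mu

# Round trip on an illustrative member (hypothetical values): beta = 0.6 shifted by mu = 0.3.
beta_true, mu_true, theta_true = 0.6, 0.3, 1.1
tau0 = mu_true + beta_true * theta_true
r0 = r2(beta_true, tau0 - mu_true)
slope0 = -beta_true * math.sin(theta_true)
beta_est, mu_est = translated_member(tau0, r0, slope0)
```

Restricting the phase to $[0, \pi]$ selects the descending branch of the cosine, which matches the requirement that the slope at $(\tau_0, r_0)$ be nonpositive.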

Now suppose that $r = r(\tau)$ intersects a member of the family $F$ from
below, say at $(\tau_0, r_0)$ with $\tau_0 < \tau_1$. Let the parameter value of this
member of $F$ be $\beta_0$. Since $0 \ge r'(\tau_0) > r_2'(\beta_0, \tau_0)$, the translated member
of $F$ passing through $(\tau_0, r_0)$ with slope $r'(\tau_0)$ has a parameter
value $\beta = \beta_1 < \beta_0$. This translated version of $r = r_2(\beta_1, \tau)$ has no local
maximum in the interval $(0, \tau_0)$, and its value at $\tau = 0$ is less than unity.
One thus has the contradiction $r(0) < 1$ and the lemma is proved.


Theorem 12: Let $r(\tau)$ be a class 2 covariance. Then

$$r(\tau) \ge \cos \tau, \qquad 0 \le \tau \le \pi.$$


Proof: In a region where $r'(\tau) \le 0$, inequality (44) implies

$$-1 \le \frac{r'(\tau)}{\sqrt{1 - r^2(\tau)}} \le 1.$$

Integrating from $\tau_0$ to $\tau > \tau_0$, assuming that $r'(\tau) \le 0$ throughout
$(\tau_0, \tau)$, one finds

$$-(\tau - \tau_0) + \arccos r_0 \le \arccos r \le (\tau - \tau_0) + \arccos r_0,$$

where $r_0 = r(\tau_0)$. This in turn implies $\cos[\tau - \tau_0 - \arccos r_0] \ge r(\tau)$
and $r(\tau) \ge \cos[\tau - \tau_0 + \arccos r_0]$, where the former inequality holds
from $\tau = \tau_0$ until the cosine assumes the value unity, and the latter
inequality holds from $\tau = \tau_0$ until the cosine assumes the value minus
unity. The result may be stated as follows: Let the class 2 covariance
$r(\tau)$ pass downward (= not upward) through the point $(\tau_0, r_0)$ in the
$\tau$-$r$ plane. The curve $r = \cos \tau$ can be translated in the $\tau$-direction
to pass downward through $(\tau_0, r_0)$. Then to the right of $\tau_0$, $r = r(\tau)$
lies above this translated cosine curve until either the cosine curve or
$r(\tau)$ has its next local minimum. Similarly, a cosine curve can be translated
to pass up through $(\tau_0, r_0)$. To the right of $\tau_0$, $r = r(\tau)$ lies below
this translated cosine curve until either $r(\tau)$ has its next local minimum
or the cosine curve has its next maximum.

A similar result holds if $r(\tau)$ increases through $(\tau_0, r_0)$.

Now let $\tau_0 = 0$, $r_0 = 1$. Then $r = r(\tau)$ lies above $r = \cos \tau$ until the
first minimum of either. If the first minimum of $r(\tau)$ occurs at $\tau_1 \ge \pi$,
the theorem is proved. Suppose now $\tau_1 < \pi$ and that $r = r(\tau)$ crosses
$r = \cos \tau$ in $(0, \pi)$. The first such crossing must be downward, since
$r(\tau) \ge \cos \tau$ from 0 to $\tau_1$. If the crossing is at $\tau_2$, then $r(\tau_2) = \cos \tau_2$
and $r'(\tau_2) \le -\sin \tau_2$. If indeed $r'(\tau_2) < -\sin \tau_2$, one obtains from (44)
the contradiction $1 \ge r^2(\tau_2) + r'^2(\tau_2) > \cos^2 \tau_2 + \sin^2 \tau_2 = 1$. On the
other hand, if the crossing takes place with $r'(\tau_2) = -\sin \tau_2$, then b of
Lemma 2 shows that $r(\tau) \le \cos \tau$ for $\tau < \tau_2$, which contradicts the
assumption that the crossing was downward. Thus, the theorem is
proved.


Theorem 13: If $r(\tau)$ is of class 2 and

$$r(\tau) \ge 0, \qquad 0 \le \tau \le \pi/\sqrt2,$$

then

$$r(\tau) \ge r_2\!\left(\frac{1}{\sqrt2}, \tau\right) = \frac{1 + \cos \sqrt2\,\tau}{2} = \cos^2\!\left(\frac{\tau}{\sqrt2}\right)$$

for

$$0 \le \tau \le \pi/\sqrt2.$$

The theorem is a consequence of repeated applications of Lemma 2.
We prove the theorem by supposing it false and then arrive at a contradiction.
We refer to the curve $r = r_2(1/\sqrt2, \tau)$, $0 \le \tau \le \pi/\sqrt2$, as $C$.

Suppose now that $r(\tau) \ge 0$ for $0 \le \tau \le \pi/\sqrt2$ and that some point
$P_0$ on $r = r(\tau)$, say $(\tau_0, r_0)$, lies below $C$. Denote $r'(\tau_0)$ by $r_0'$. We can
suppose $P_0$ chosen so that $r_0' < 0$, since $r = r(\tau)$ cannot be nondecreasing
at all points where it lies below $C$. Let the horizontal line $r = r_0$
through $P_0$ intersect $C$ at $P_1$ and denote the slope of $C$ at $P_1$ by $C'(r_0)$.
The point $P_1$ has larger abscissa than the point $P_0$. The curve $r = r(\tau)$
possesses a continuous derivative. As the height $r_0$ of the horizontal line
$r = r_0$ is continuously decreased to zero from its initial value, a value
must be found with $P_0$ to the left of $P_1$ and $r_0' \ge C'(r_0)$. By b of Lemma
2, a curve of the family $F$ with parameter value $\beta \le 1/\sqrt2$ can be translated
to the left to pass through $P_0$ with slope $r_0'$. In the interval $0 \le
\tau \le \tau_0$ this translated member of $F$ lies strictly below $C$ and is monotone.
The first local maximum of $r = r(\tau)$ to the left of $P_0$ therefore
lies below $C$ as must also the local minimum just preceding this maximum.
A curve of $F$ can then be translated to pass through this local
minimum with slope zero, and repetition of the argument shows that
all local maxima of $r = r(\tau)$ for $0 \le \tau \le \tau_0$ lie below $C$. In particular
$r(0) < 1$, which contradicts the initial assumption concerning $r(\tau)$.
Q.E.D.


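Theorem 14: Let the covariance $r(\tau)$ have the behavior

$$r(\tau) = 1 - \frac{\tau^2}{2} + m\,\frac{\tau^4}{4!} + o(\tau^4)$$

near $\tau = 0$. Then

$$r(\tau) \le r_2(\beta, \tau), \qquad 0 \le \tau \le T_1,$$

with $r_2(\beta, \tau)$ given by (45) and $\beta = 1/\sqrt{m}$. Here $T_1 = \min(\beta\pi, \tau_1)$ and $\tau_1$ is the smallest
positive value of $\tau$ for which $r(\tau) = 1 - 2/m$.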

Proof: The first four derivatives of $r(\tau)$ exist at $\tau = 0$. From the
Bochner representation for $r(\tau)$, it is easy to show using Schwarz's
inequality that

$$\nu^2 = m - 1 \ge 0. \tag{49}$$

It also follows that $r''(\tau)$ exists everywhere and is continuous.
The Gaussian process $X(t)$ having covariance $r(\tau)$ has first and second
derivatives $X'(t)$ and $X''(t)$ almost everywhere with probability 1. The
covariance matrix of the random variables $X(0), X(t), X'(t), X''(t)$ is

$$\begin{pmatrix}
1 & r & r' & r'' \\
r & 1 & 0 & -1 \\
r' & 0 & 1 & 0 \\
r'' & -1 & 0 & m
\end{pmatrix}.$$

The determinant of this matrix cannot be negative. This is equivalent
to the inequalities

$$-\nu \le \frac{r'' + r}{\sqrt{1 - r^2 - r'^2}} \le \nu.$$

In any region where $r' < 0$, the right-hand inequality gives

$$\frac{d}{d\tau} \sqrt{1 - r^2 - r'^2} \le -\nu r'.$$

Integrate this from 0 to $\tau$ to obtain

$$\sqrt{1 - r^2 - r'^2} \le \nu(1 - r). \tag{50}$$

Note that if $\tau_1$ is the first positive value of $\tau$ for which $r'(\tau) = 0$, (50)
gives

$$r(\tau_1) \le \frac{\nu^2 - 1}{\nu^2 + 1}.$$

Thus we have the interesting side result that if $r(\tau)$ is everywhere nonnegative,
$\nu^2 \ge 1$ or $m \ge 2$.

Squaring the inequality (50) and rearranging the terms, one finds

$$r'^2 \ge (1 + \nu^2)(1 - r)(r - \alpha),$$

where

$$\alpha = \frac{\nu^2 - 1}{\nu^2 + 1}. \tag{51}$$

Since $r' \le 0$, this implies

$$\frac{-r'}{\sqrt{(1 - r)(r - \alpha)}} \ge \sqrt{1 + \nu^2}$$

if $r > \alpha$. Integration from 0 to $\tau$ yields

$$\frac{\pi}{2} - \arcsin \frac{r - (1 + \alpha)/2}{(1 - \alpha)/2} \ge \sqrt{1 + \nu^2}\;\tau,$$

496 THE BELL SYSTEM TECHNICAL JOURNAL, MARCH 1962 


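where it is assumed that $\tau \le \tau_1$. If then

$$\frac{\pi}{2} - \sqrt{1 + \nu^2}\;\tau \ge -\frac{\pi}{2},$$

$$\frac{r - (1 + \alpha)/2}{(1 - \alpha)/2} \le \sin\left(\frac{\pi}{2} - \sqrt{1 + \nu^2}\;\tau\right)$$

or, what is the same thing in virtue of the definitions (49) and (51),

$$r(\tau) \le 1 - \frac{1}{m} + \frac{1}{m} \cos\left(\sqrt{m}\;\tau\right).$$

The theorem is thus proved.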
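
The bounds of Theorems 12, 13, and 14 can be checked together on a concrete class 2 covariance. The sketch below is an illustration under stated assumptions: the test covariance $r(\tau) = e^{-\tau^2/2}$ is chosen here (it has $r''(0) = -1$ and fourth derivative $m = 3$ at the origin), the bounds are taken in the forms $r \ge \cos\tau$ on $[0, \pi]$, $r \ge r_2(1/\sqrt2, \tau)$ on $[0, \pi/\sqrt2]$, and $r \le r_2(1/\sqrt m, \tau)$ on $[0, T_1]$, and a grid check is of course not a proof.

```python
import math

def r(tau):
    # Assumed test covariance (not from the paper): exp(-tau^2 / 2), so r''(0) = -1 and m = 3.
    return math.exp(-0.5 * tau * tau)

def r2(beta, tau):
    """Member of the family F: 1 - beta^2 + beta^2 * cos(tau / beta)."""
    return 1.0 - beta ** 2 + beta ** 2 * math.cos(tau / beta)

M = 3.0
TAU1 = math.sqrt(2.0 * math.log(3.0))    # smallest tau with r(tau) = 1 - 2/M = 1/3
T1 = min(math.pi / math.sqrt(M), TAU1)   # upper limit of validity in Theorem 14

def grid(upper, n=2000):
    """Equally spaced points on [0, upper], endpoints included."""
    return [upper * k / n for k in range(n + 1)]

EPS = 1e-12  # tolerance for rounding where the curves touch at tau = 0
thm12_ok = all(r(t) >= math.cos(t) - EPS for t in grid(math.pi))
thm13_ok = all(r(t) >= r2(1.0 / math.sqrt(2.0), t) - EPS
               for t in grid(math.pi / math.sqrt(2.0)))
thm14_ok = all(r(t) <= r2(1.0 / math.sqrt(M), t) + EPS for t in grid(T1))
```

For this covariance the three curves agree with $r(\tau)$ through order $\tau^2$ at the origin, and the upper bound of Theorem 14 agrees through order $\tau^4$, so the inequalities are tight near $\tau = 0$ and the small tolerance is needed only for rounding.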


2.8 Proof of Theorem 9 


Let $h(\xi)$ be nonnegative for $0 \le \xi \le \delta$ and zero elsewhere. Then

$$Y(t) = \int_{t-\delta}^{t} h(t - \xi)\,X(\xi)\,d\xi = \int_{-\infty}^{\infty} du\, h(u)\,X(t - u)$$

will certainly be nonnegative for $0 \le t \le T$ whenever $X(t)$ is nonnegative
for $-\delta \le t \le T$. The probability that the $Y$ process be nonnegative
in $(0, T)$ is therefore not less than the probability that the $X$ process
be nonnegative in $(-\delta, T)$. If $X$ is Gaussian with mean zero and covariance
$r(\tau)$, then $Y$ is Gaussian with mean zero and covariance

$$\begin{aligned}
r_h(\tau) = E\,Y(t)\,Y(t + \tau) &= \int du \int dv\, h(u)\,h(v)\, E\,X(t - u)\,X(t + \tau - v) \\
&= \int du \int dv\, h(u)\,h(v)\, r(\tau - u + v) \\
&= \int dx\, r(\tau - x) \int d\xi\, h(x + \xi)\,h(\xi).
\end{aligned}$$

One has then $P[T, r_h(\tau)] \ge P[T + \delta, r(\tau)]$, which is Theorem 9.


2.9 Proof of Theorem 10 


Let $0 = t_1 < t_2 < \cdots < t_n = T$ be a partition of $(0, T)$. Define $Q_n(r)$
by

$$Q_n(r) = \frac{\Pr\{X(t_1) < 0,\ X(t_i) \ge 0,\ i = 2, 3, \ldots, n\}}{\Pr\{X(t_1) < 0,\ X(t_2) \ge 0\}}, \tag{52}$$

where $X(t)$ is a Gaussian process with zero mean and class 2 covariance
$r(\tau)$. As the partition is refined with mesh tending to zero, $Q_n(r)$
approaches $Q[T, r(\tau)]$ as a limit. The numerator on the right of (52) is
$P_n(\hat r)$ where

$$\hat r = \begin{pmatrix}
1 & -r(t_2) & -r(t_3) & \cdots & -r(t_n) \\
-r(t_2) & 1 & r(t_3 - t_2) & \cdots & r(t_n - t_2) \\
-r(t_3) & r(t_3 - t_2) & 1 & \cdots & r(t_n - t_3) \\
\vdots & \vdots & \vdots & & \vdots \\
-r(t_n) & r(t_n - t_2) & r(t_n - t_3) & \cdots & 1
\end{pmatrix} \tag{53}$$

and as usual $P_n(r)$ denotes the probability that $n$ normal variates of
mean zero and covariance matrix $r$ be nonnegative. Note that the denominator
of the right of (52) depends only on $r(t_2)$.

Let another Gaussian process, $Y(t)$, have class 2 covariance $q(\tau)$.
We define $r^{-1}(\tau)$, $q^{-1}(\tau)$, $h(\tau) = r^{-1}[q(\tau)]$ as in Section 2.3 and set $g(t) =
q^{-1}[r(t)] = h^{-1}(t)$. Note that $g(t)$ is strictly monotone within its domain
of definition. Assume that $T$ is within the domain of definition of $g$.
With the points $t_i$ chosen as in (52), set $\tau_i = g(t_i)$, $i = 1, 2, \ldots, n$. The
points $0 = \tau_1 < \tau_2 < \cdots < \tau_n = g(T)$ form a partition of the interval
$(0, g(T))$. The mesh of this partition tends to zero with the mesh of the
$t_i$ partition.

Consider now the approximation to $Q[g(T), q(\tau)]$ given by

$$Q_n(q) = \frac{\Pr\{Y(\tau_1) < 0,\ Y(\tau_i) \ge 0,\ i = 2, 3, \ldots, n\}}{\Pr\{Y(\tau_1) < 0,\ Y(\tau_2) \ge 0\}}. \tag{54}$$

The numerator here is $P_n(\hat q)$ where $\hat q$ is given by (53) with $r$ replaced
by $q$ and $t$ replaced by $\tau$. Since $\tau_i = g(t_i)$, $q(\tau_i) = r(t_i)$, $i = 1, 2, \ldots, n$, so
that the first row and column of $\hat r$ are the same as the first row and
column of $\hat q$. For any other entry of $\hat r$ with $t_i \ge t_j$, one has

$$\begin{aligned}
r(t_i - t_j) &= q[g(t_i - t_j)] \\
&= q\{\tau_i - \tau_j + [g(t_i - t_j) - g(t_i) + g(t_j)]\}.
\end{aligned}$$

Since $q(\tau)$ is nonincreasing,

$$r(t_i - t_j) \le q(\tau_i - \tau_j)$$

and hence by Lemma 1

$$P_n(\hat r) \le P_n(\hat q),$$

provided

$$g(t_i - t_j) - g(t_i) + g(t_j) \ge 0$$

or what is the same thing, provided

$$g(x) + g(y) \ge g(x + y), \tag{55}$$

where $0 \le y = t_j < t_i = x + y$.

When (55) is satisfied, the numerator of (54) is not less than the
numerator of (52). The denominators of these expressions are equal
since they are the same function of $r(t_2) = q(\tau_2)$. Therefore, $Q_n(q) \ge
Q_n(r)$. The conclusion of Theorem 10 results by passing to the limit as
the $t$ partition is refined.


2.10 Generalizations 


A number of the results presented in this paper can be generalized in 
a direct manner. We only mention here an obvious extension of Theorem 1. 

In the derivation of Lemma 1, the lower limit of integration for $x_i$ in
(33) can be replaced by $a_i$. Now choose $a_i = a(t_i)$ with $a(t)$ a given
function defined for $0 \le t \le T$, and where the points $t_i$ form a partition
of $(0, T)$. Proceeding as in the derivation of Theorem 1, one arrives at
the following more general result. Let $X(t)$ be a Gaussian process with
$EX(t) = 0$, $EX(t)X(s) = r(s, t)$. Let $Y(t)$ be a Gaussian process with
$EY(t) = 0$, $EY(t)Y(s) = q(s, t)$. Then if

$$r(s, s) = q(s, s), \qquad 0 \le s \le T$$

and

$$r(s, t) \ge q(s, t), \qquad 0 \le s, t \le T,$$

$$\Pr\{X(t) \ge a(t),\ 0 \le t \le T\} \ge \Pr\{Y(t) \ge a(t),\ 0 \le t \le T\}.$$


2.11 Asymptotics 


As already remarked in the introduction of this paper, there appears
to be little in the literature concerning the asymptotic behavior of
$P[T, r(\tau)]$ for large $T$. Intuition would indicate exponential falloff for a
wide class of covariances. Example (27) of Section 1.1, though special
in nature since $r_2(\beta, \tau)$ is periodic, provides a counterexample to
exponential behavior, and so the class must be carefully defined. Here, by
the two bounds presented in Section 1.4, we have shown exponential
behavior for nonnegative covariances that vanish identically for $\tau$
greater than some $\tau_0 > 0$. Recently, by using Theorem 1, M. Rosenblatt
has established an asymptotic exponential upper bound for $P[T, r(\tau)]$ for
all covariances which are ultimately majorized by a decaying exponential.
This, together with the lower bound of Section 1.4, establishes the
asymptotic exponential behavior of $P[T, r(\tau)]$ for all nonnegative
covariances that themselves decay exponentially. Professor Rosenblatt has
also established that if $r(\tau) \to 0$ with increasing $\tau$, then $T^n P[T, r(\tau)] \to 0$
with increasing $T$ for every $n > 0$.

We conclude with the remark that from (23) of Section 1.7, one can
show that asymptotic exponential behavior of $P[T, r(\tau)]$ implies asymptotic
exponential behavior for $Q[T, r(\tau)]$.
Probability Distributions for the Phase
Jitter in Self-Timed Reconstructive
Repeaters for PCM


By M. R. AARON and J. R. GRAY
(Manuscript received August 25, 1961)


Probability distributions for the timing jitter in the output of an idealized
self-timed repeater for reconstructing a PCM signal are approximated.
Primary emphasis is focused on self-timed repeaters employing complete
retiming. In this case the probability distribution for the timing jitter reduces
to the computation of the phase error in the zero crossings at the output of
the tuned circuit excited by a jitter-free binary pulse train. It is assumed
that the tuned circuit is mistuned from the pulse repetition frequency, and
the individual pulses are either impulses or raised cosine pulses. Both
random pulse trains and random plus periodic trains are considered. In
general, the probability distributions are skewed in the direction of increasing
phase error. The approach to the normal law in the neighborhood of the
mean when the circuit Q becomes arbitrarily large is demonstrated. Results
obtained from the analytical approach are compared with two computer
methods for the case of random impulse excitation of a tuned circuit
characterized by a Q of 125 and mistuning of 0.1 per cent. Excellent agreement
among the three techniques is displayed. For no mistuning and raised
cosine excitation two methods for computing the phase error are given and
numerical results obtained from both techniques agree closely.

Some attention is given to an idealized version of a reconstructive repeater
employing partial retiming and it is shown that the timing performance of
such a repeater for random signals is very much inferior to the completely
retimed repeater.


I. INTRODUCTION


Over the past several years the problem of maintaining pulse spacing 
within very close bounds in PCM transmission has received considerable 
attention both theoretically and experimentally. The effects of timing 
jitter in degrading repeater performance, in introducing distortion in 
the decoded analog signal, and in enhancing the difficulty of dropping
or adding several pulse trains in time have been documented.’® Sources 
of mistiming in a self-timed reconstructive repeater are well catalogued 
and include: noise, crosstalk, mistuning, finite pulse width effects, and 
amplitude to phase conversion in nonlinear devices. The first four of 
these sources have been considered in various analyses of timing jitter 
in self-timed and separately-timed PCM repeaters. Amplitude to phase 
conversion in nonlinear circuits has received attention primarily from 
the experimental viewpoint. 

The majority of the theoretical work to date has been concerned with 
timing errors in self-timed repeaters when the timing-wave extractor is 
a simple tuned circuit. For a random pulse train exciting the tuned circuit 
in the presence of noise and mistuning, results have been obtained for 
the mean displacement and the standard deviation of the zero crossings 
from their ideal location. This analysis is appropriate to repeaters em- 
ploying complete retiming. These time displacements can also be 
considered as phase errors and we will use this terminology in what 
follows. If the probability density function for the phase error is normal, 
the mean and standard deviation are sufficient for a complete statistical 
description. In this paper we will show that in general the probability 
density function is not normal, and is inherently unsymmetrical about 
the mean. 

An approximation to the probability density and the cumulative 
distribution for the phase error at the output of a mistuned resonant 
circuit will be derived for both random and random plus periodic pulse 
trains. A completely random pulse train is defined to be one in which 
pulses and spaces are equally likely. The individual pulses of the binary 
pulse train are assumed to be jitter free and are either impulses or raised 
cosine pulses. The approach to the normal law when the circuit Q is 
large is demonstrated. For a value of Q of 125 and a mistuning of 0.1
per cent from the pulse repetition frequency, a comparison of numerical
results obtained from the analytical approach and two computer methods
is made. Agreement among the three approaches is excellent.

Our plan of attack is to place all of the manipulations required to 
specify the tuned circuit response to the most general pulse trains in 
the Appendix and concentrate on most of the probabilistic notions in 
the main body of the paper. Appendix A covers the response of the 
tuned circuit to a random or random plus periodic binary pulse train of 
arbitrary pulse shape, and Appendix B is concerned with the specialization
to raised cosine pulses. Section II of the text deals with the terminology
required, covers the tuned circuit response to impulses, and briefly
summarizes the results of Appendices A and B. In Section III, the
probability density function for the phase error is derived. Section IV 
is devoted to the cumulative distribution function and Section V alludes 
to the semi-invariants that are required in the evaluation of the density 
and cumulative distribution functions. These semi-invariants are de- 
rived in Appendix C. The approach of the probability density function 
for the phase error to the normal law as the circuit Q becomes arbitrarily 
large is displayed in Section VI with the algebraic support relegated to 
Appendix D. The comparison of numerical results mentioned previously 
with other computer approaches is made in Section VII. For zero mis- 
tuning, but finite pulse width excitation, it can be shown that the proba- 
bility distributions for the phase error can be related directly to the 
probability distribution for the timing wave amplitude. This is demon- 
strated in Section VIII. A discussion of further numerical results is given 
in Section IX. We consider an idealized model of a partially retimed 
repeater in Section X for purposes of comparison with the results of 
Section IX. A wrap-up of the procedures, results, and future work
concludes the paper. 


II. RESPONSE OF THE TIMING CIRCUIT 


Before we go on to the general equation for the phase error due to 
finite pulse width and mistuning, we will specialize to impulse excitation 
of a simple tuned circuit characterized by its Q and mistuning from the 
pulse repetition frequency. This should provide the casual reader with 
some feel for how the more general equation for the phase error arises 
without going through the detailed manipulations of Appendices A and 
B. The procedure adopted in the analysis to follow is equivalent to that 
of H. E. Rowe.’ 

Assuming the input to the timing circuit to be a train of jitter-free
unit impulses occurring at random with spacing $T$, the excitation may
be represented as

$$f(t) = \sum_n a_n\, \delta(t - nT), \tag{1}$$

where $a_n$ is a random variable taking the values 0 or 1 with probability
$\tfrac12$,* $\delta(t - nT)$ is a unit impulse whose time of arrival is $nT$, and the
spacing $T$ is the reciprocal of the pulse repetition frequency $f_r$.

* Unless otherwise specified, the case of equal likelihood will be considered in
all calculations.

For a parallel resonant circuit the impulse response is given by

$$h(t) = A\, e^{-(\pi/Q) f_0 t} \cos(2\pi f_0 t + \varphi), \tag{2}$$

where

$$f_0 = \frac{1}{2\pi} \sqrt{\frac{1}{LC} - \left(\frac{1}{2RC}\right)^2}, \qquad A = \frac{1}{2QC} \sqrt{4Q^2 + 1},$$

$$Q = 2\pi f_0 RC, \qquad \text{and} \qquad \varphi = \tan^{-1} \frac{1}{2Q}.$$

Here $f_0$ is the natural resonant frequency as distinguished from the
steady-state resonant frequency $(1/2\pi)\sqrt{1/LC}$. Combining (1)
and (2), the total response to all impulses occurring in time slots up to
and including the one at $t = 0$ may be written as

$$F(t) = A \sum_{n=-\infty}^{0} a_n\, e^{-(\pi/Q) f_0 (t - nT)} \cos[2\pi f_0 (t - nT) + \varphi]. \tag{3}$$
This expression gives the output of the timing circuit for values of $t$
in the interval between $t = 0$ and the arrival time of the next impulse.
Rewriting (3) in the form of a carrier with both amplitude and phase
modulation we get

$$F(t) = A \sqrt{x^2 + y^2}\; e^{-(\pi/Q) f_0 t} \cos[2\pi f_0 t + \varphi + \theta], \tag{4}$$

where

$$\theta = \tan^{-1} \frac{y}{x},$$

$$x = \sum_{n=0}^{\infty} a_{-n}\, e^{-(\pi/Q) f_0 nT} \cos 2\pi f_0 nT, \qquad \text{and}$$

$$y = \sum_{n=0}^{\infty} a_{-n}\, e^{-(\pi/Q) f_0 nT} \sin 2\pi f_0 nT.$$

In the above $x$ and $y$ represent the in-phase and quadrature components
of the response. If the tank could be tuned exactly to the pulse repetition
frequency ($f_0 = f_r = 1/T$), then the phase modulation would disappear
and the amplitude modulation would be dependent on $x$ alone. In practical
applications this is not possible and the phase shift $\theta$ does occur.
If we denote the fractional mistuning $\Delta f / f_r$ by $k$, we may write $f_0$ in
terms of $f_r$ as follows

$$f_0 = f_r (1 + k).$$

In this case (4) becomes, neglecting $k$ with respect to unity in the
exponential term,

$$F(t) = A \sqrt{x^2 + y^2}\; e^{-(\pi/Q) f_r t} \cos[2\pi f_r (1 + k) t + \varphi + \theta], \tag{5}$$

with

$$x = \sum_{n=0}^{\infty} a_{-n}\, e^{-(\pi/Q) n} \cos 2\pi k n,$$

$$y = \sum_{n=0}^{\infty} a_{-n}\, e^{-(\pi/Q) n} \sin 2\pi k n,$$

and

$$\theta = \tan^{-1} y/x.$$

To illustrate the relationship between the timing deviation $t_d$ and
the phase error $\theta$, it is assumed that repeater delays have been adjusted
so that the timing wave supplied to the regenerator in the absence of
mistuning is properly aligned with the signal impulses in the
information-bearing channel. In this case, the negative-going zero crossing occurring
ideally at $t_0 = T/4$ determines the instant of regeneration. When mistuning
is present this zero crossing is displaced such that it occurs at
the instant $t_0' = T(\tfrac14 - \theta/2\pi)$. The difference $t_0 - t_0'$ will then give the
timing deviation which, expressed as a fractional part of the pulse
spacing, is

$$\frac{t_d}{T} = \frac{\theta}{2\pi}. \tag{6}$$

From (6) and the definition of $\theta$, the phase error corresponding to
the timing deviation is related to the random variables $x$ and $y$ by

$$\theta = \tan^{-1} \frac{y}{x}. \tag{7}$$
xv 


In deriving (7) it should be recalled that only the incidental approximation
$k \ll 1$ has been made. When we consider a binary pulse train in
which the pulses representing the binary "one" are of arbitrary pulse
shape, it is necessary to make other approximations to arrive at a tractable
expression for the phase error. Furthermore, the excitation encompasses
the infinite past as well as the tails of succeeding pulses to
accommodate driving pulses that may overlap or are not time limited.
The most general result given by (59) is an extension along two lines
of Rowe's relationship for the timing jitter in the output of the tuned
circuit due to mistuning and finite pulse width. First, the results are
applicable to arbitrary pulse shape. Secondly, our relationship for the
phase error is based on a different approximation in the case of finite
width pulses.

In Appendix B we specialize to the case of raised cosine pulses in order
to make use of some of Rowe's results. For this case the phase error is
given by (73) and takes the form

\[ \theta = \frac{y + a}{x + b} + c, \tag{8} \]

where a, b, and c are constants that depend upon Q, k, and the pulse
width T/s of the raised cosine pulse; x and y are correlated random
variables that depend upon Q, k, and the pulse pattern. They are defined
below (5) with the additional constraint that a_0 = 1 when we consider
finite width pulses; i.e., a pulse definitely occurs at the origin. In our
notation, a positive phase error corresponds to the zero crossing of
interest occurring prior to the reference. The largest pulse width we
consider is 1.5T. This avoids the necessity of considering the effect of
the presence or absence of a following pulse on the negative-going zero 
crossing of interest. Similarly, for positive-going zero crossings we do 
not have to use special methods for considering the occurrence or non- 
occurrence of a preceding pulse. This is not a serious analytical restric- 
tion, since larger pulse widths can be handled by the machinery provided 
in Section A-4. As a practical matter in the design of a self-timed recon- 
structive repeater for operation in a long repeater chain, wider pulses 
would introduce intolerable phase jitter. In the following, we will also 
neglect the constant c in (8), since it is independent of pulse pattern 
and can in principle be compensated for in either the timing path or 
information-bearing path in a self-timed reconstructive repeater. 




III. PROBABILITY DENSITY FOR THE PHASE ERROR 


3.1 Preliminaries


From the above, the random variable of interest is

\[ \theta = \frac{y + a}{x + b} = \frac{y_1}{x_1}. \tag{9} \]
To determine the probability density p(θ) or the cumulative distribution
F(θ), we consider the joint probability density p(x_1, y_1) of the correlated
random variables x_1 and y_1. Then F(θ) = Pr[(y_1/x_1) ≤ θ], which
may be written

\[ F(\theta) = \int_0^{\infty} dx_1 \int_{-\infty}^{\theta x_1} dy_1\, p(x_1, y_1) + \int_{-\infty}^{0} dx_1 \int_{\theta x_1}^{\infty} dy_1\, p(x_1, y_1). \]

Differentiation of F(θ) with respect to θ plus rearrangement yields

\[ p(\theta) = \int_0^{\infty} x_1\, p(x_1, \theta x_1)\, dx_1 + \int_0^{\infty} x_1\, p(-x_1, -\theta x_1)\, dx_1. \tag{10} \]


Therefore if p(x_1, y_1) is known, p(θ) can be determined by integration.
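The integration in (10) is easy to check numerically in a case where the answer is known. The sketch below is an illustration only: it assumes a joint density of two independent standard normal variables (not the density arising in the timing problem), for which the ratio is Cauchy distributed with density 1/[π(1 + θ²)]. The function names, the integration limit and the step count are arbitrary choices.

```python
import math

def ratio_density(theta, joint_pdf, upper=12.0, steps=4000):
    """Evaluate the two integrals of (10) by the trapezoidal rule on 0 <= x1 <= upper."""
    h = upper / steps
    total = 0.0
    for i in range(steps + 1):
        x1 = i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * x1 * (joint_pdf(x1, theta * x1) + joint_pdf(-x1, -theta * x1))
    return total * h

def independent_normals(x1, y1):
    """Illustrative joint density: two independent standard normal variables."""
    return math.exp(-0.5 * (x1 * x1 + y1 * y1)) / (2.0 * math.pi)

print(ratio_density(0.5, independent_normals), 1.0 / (math.pi * 1.25))
```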

As is typical of this class of problems when x_1 and y_1 are not correlated
normal variables, the exact determination of p(x_1, y_1) is rarely obtainable.
Therefore, we find it essential to proceed along approximate lines.
We can write the characteristic function φ(u, v) for p(x_1, y_1) as

\[ \varphi(u, v) = \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dy_1\, e^{i(u x_1 + v y_1)}\, p(x_1, y_1). \tag{11} \]
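If we take the partial derivative of (11) with respect to u, evaluate it
at u = −θv, divide both sides by 2πi, and integrate over v from −∞ to
∞, we get

\[ \frac{1}{2\pi i} \int_{-\infty}^{\infty} \left. \frac{\partial \varphi(u, v)}{\partial u} \right|_{u = -\theta v} dv = \frac{1}{2\pi} \int_{-\infty}^{\infty} dv \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dy_1\, x_1\, e^{i v (y_1 - \theta x_1)}\, p(x_1, y_1). \]

When we interchange the order of integration to integrate over v first,

\[ \frac{1}{2\pi i} \int_{-\infty}^{\infty} \left. \frac{\partial \varphi(u, v)}{\partial u} \right|_{u = -\theta v} dv = \int_{-\infty}^{\infty} dx_1 \int_{-\infty}^{\infty} dy_1\, x_1\, \delta(y_1 - \theta x_1)\, p(x_1, y_1), \]

where δ(y_1 − θx_1) is the Dirac delta function. Integration over y_1 then
results in

\[ \frac{1}{2\pi i} \int_{-\infty}^{\infty} \left. \frac{\partial \varphi(u, v)}{\partial u} \right|_{u = -\theta v} dv = \int_{-\infty}^{\infty} x_1\, p(x_1, \theta x_1)\, dx_1 = \int_0^{\infty} x_1\, p(x_1, \theta x_1)\, dx_1 - \int_0^{\infty} x_1\, p(-x_1, -\theta x_1)\, dx_1. \tag{12} \]

A comparison of (10) with (12) reveals that they are equal provided
that x_1 is always positive, in which case p(−x_1, −θx_1) is zero. Under
this condition*

\[ p(\theta) = \frac{1}{2\pi i} \int_{-\infty}^{\infty} \left. \frac{\partial \varphi(u, v)}{\partial u} \right|_{u = -\theta v} dv. \tag{13} \]

In the following we will use (13) to approximate p(θ); before doing
so we make a few remarks about the range of the random variables x_1
and θ.


* The result in (13) is given as an exercise for the reader on p. 317 of Ref. 9.


3.2 Minimum Values of x_1 and y_1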


Our comments in this section will largely be confined to the case of
impulse excitation, in which case x_1 = x and y_1 = y, where x and y are
defined following (5). From the definition of x it can be seen that it
attains its minimum value for the set of a_n = 1 in which the argument
of cos 2πkn is in the second and third quadrants (modulo 2π). With this
pulse pattern it is easily shown that

\[ x_{\min} = -\frac{\beta \sin 2\pi k \; e^{-\pi/4kQ}}{(1 - e^{-\pi/2kQ})(1 - 2\beta \cos 2\pi k + \beta^2)} = -\frac{2\bar{y}\, e^{-\pi/4kQ}}{1 - e^{-\pi/2kQ}}, \]

where β = e^{−π/Q} and ȳ = average value of y (from Appendix D).


For the values of k and Q that we consider, namely kQ less than about
0.1 and Q ≥ 100, an excellent approximation for x_min is

\[ x_{\min} \approx -2\bar{y}\, e^{-\pi/4kQ}. \]

When kQ is fixed at 0.1,

\[ x_{\min} \approx -\frac{2kQ^2}{\pi}\, e^{-\pi/4kQ}, \]

and for Q = 100, x_min = −0.005. The ratio x_min/x̄, where x̄ = average
value of x, can be shown to be

\[ \frac{x_{\min}}{\bar{x}} \approx -4kQ\, e^{-\pi/4kQ}, \]

which for kQ = 0.1 is −0.00016, or very close to zero. Based on unpublished
work of one of the authors, the probability of x/x̄ even going
negative is so remote as to be completely unimportant and decreases
with increasing Q for kQ fixed.

Another interesting way of looking at the probability of x becoming
negative is to consider the probability of pulses occurring in the first
quadrant of the argument of cos 2πkn to constrain the minimum value
of x to zero. This can occur in any of several ways. One possibility is to
choose a single pulse (a single a_n = 1) in the sector of the first quadrant
bounded by n = 0 and the largest integral value of n that satisfies

\[ \beta^n \cos 2\pi k n > |x_{\min}|. \]

For Q = 100 and kQ = 0.1, the above is satisfied for a value of n that
is less than about 148. The probability of at least one pulse in this range
of n is 1 − (1 − p)^{148}, which is about 1 − 10^{-45} for equally likely pulses
and spaces. Therefore, x is positive with probability very close to unity.
For increasing values of Q, with kQ fixed at 0.1, the probability that
x is > 0 approaches unity even more closely.
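The bound on n and the resulting probability can be reproduced with a few lines of arithmetic. The sketch below assumes the value |x_min| = 0.005 quoted above for Q = 100 and kQ = 0.1, and counts every slot from n = 0 up to the limit; the variable names are illustrative. The limit it finds is close to, but not necessarily identical with, the rounded figure quoted in the text.

```python
import math

Q, k, p = 100.0, 1e-3, 0.5
x_min_abs = 0.005  # |x_min| as quoted in the text for Q = 100, kQ = 0.1
beta = math.exp(-math.pi / Q)

# Largest n in the first quadrant (2*pi*k*n < pi/2) for which a single pulse
# at slot n contributes more than |x_min| to x.
first_quadrant = range(0, int(1 / (4 * k)))
n_limit = max(n for n in first_quadrant
              if beta**n * math.cos(2 * math.pi * k * n) > x_min_abs)

# Probability that no pulse at all occurs in slots 0 .. n_limit.
prob_no_pulse = (1 - p) ** (n_limit + 1)
print(n_limit, prob_no_pulse)
```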

By an argument that parallels the above, the probability that y < 0
for k > 0 and impulse excitation is very small. Similarly, the probability
that y > 0 for k < 0 is extremely small.

For raised cosine excitation, x_min is increased by 1 + b, which for
the pulse widths considered herein is always > 0.25, thereby making
x_min positive for the Q's of interest to us. We also note that long strings
of zeros as required in attaining x_min cannot be tolerated in a PCM
repeater with a simple tuned circuit timing extractor, since the timing
wave amplitude would fall well below the point at which it would be
useful in the repeater. A higher minimum on the timing wave amplitude
can be assured by constraining the transmitted pulse train to avoid such
long strings of spaces. In this paper we simulate this constraint by the
introduction of a forced periodic pattern of pulses in the otherwise
random train. This serves to increase x_min and decrease the range of θ
as we shall see below and in Sections VII and VIII.


3.3 Range of θ


For random impulse excitation, it is apparent from (5) that θ is unbounded
when we choose a single a_n = 1 for n large and all the rest zero.
However, with a_0 = 1 and the values of Q we consider, x is always
positive, and from the results of Section 3.2 θ is essentially confined to
(0, π/2) for k > 0 and [0, −(π/2)] for k < 0. In the following we seek
tighter bounds under the practically important case a_0 = 1. Experimentally,
a_0 = 1 means that we examine only those time slots containing
pulses.

For the general form of θ, D. Slepian and E. N. Gilbert of Bell Telephone
Laboratories* have developed an algorithm for determining the
pattern that yields the maximum value of θ. Their result is particularly
simple when kQ ≪ 1; then we can approximate x by

\[ x \approx 1 + \sum_{n=1}^{\infty} a_n \beta^n \]

and y by

\[ y \approx 2\pi k \sum_{n=1}^{\infty} a_n\, n\, \beta^n. \]

Under this condition Gilbert and Slepian have shown that the pulse
pattern giving the largest value of θ is specified by all pulses present for
n ≥ n_c and pulses absent for n < n_c. The value of n_c is obtained from†

\[ \frac{\beta^{n_c + 1}}{(1 - \beta)^2} = n_c (1 + b) - \frac{a}{2\pi k}, \tag{14} \]

where β = e^{−π/Q}. For random impulse excitation a = 0 = b. For this
case, n_c versus β obtained from (14) is shown in Fig. 1. For β < 1/2, all
pulses present (n_c = 1) yields the maximum value for θ. In the range
1/2 < β < 0.639 the pulse immediately adjacent to the origin is dropped
out to obtain θ_max, and so on.

The maximum value attained in a specified interval is achieved for
the largest β in the interval and the maximum value is given simply by
2πk times the n_c defined by the β interval. The β intervals corresponding
to constant n_c get smaller and smaller as β approaches one. This is
illustrated in Fig. 2, where we have plotted n_c against Q rather than β,
showing a continuous approximation to the actual staircase characteristic.
We note that for Q = 100, n_c = 80 and θ_max = 2πk n_c = 160πk.
With k = 10^{-3}, θ_max = 0.16π radians.


* Private communication.

† See Appendix E for the proof.


Fig. 1 — n_c vs β for random impulse excitation.
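A continuous solution of (14), of the kind plotted in Fig. 2, can be obtained by simple bisection. The sketch below is only an illustration under the reading of (14) as β^{n_c+1}/(1 − β)² = n_c(1 + b) − a/2πk; the function name `critical_index` and the bisection settings are assumptions, and the routine expects a ≥ 0 and b > −1 so that the residual falls monotonically from a positive starting value.

```python
import math

def critical_index(Q, k=1e-3, a=0.0, b=0.0):
    """Continuous root n_c of beta**(n+1)/(1-beta)**2 = n*(1+b) - a/(2*pi*k)."""
    beta = math.exp(-math.pi / Q)

    def residual(n):
        return beta ** (n + 1) / (1 - beta) ** 2 - n * (1 + b) + a / (2 * math.pi * k)

    lo, hi = 0.0, 1.0
    while residual(hi) > 0:      # the residual decreases with n; expand until it changes sign
        hi *= 2.0
    for _ in range(200):         # plain bisection
        mid = 0.5 * (lo + hi)
        if residual(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

n_c = critical_index(Q=100.0)
theta_max = 2 * math.pi * 1e-3 * n_c
print(n_c, theta_max)
```

For Q = 100 the root falls between 80 and 81, in line with the value n_c = 80 read from the continuous curve, and for β = 1/2 it equals 1, the first step of the staircase.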
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Fig. 2— n. vs Q. 


For finite width pulses, a and b are non-zero. With raised cosine pulses
of pulse width less than 1.5 time slots, a < 0.65 and b > −0.75, with the
largest negative value of b corresponding to the consideration of positive
going time slots. When the mistuning, k, is positive, the effect of finite
pulse width then is to raise the maximum value of n_c over the impulse
case and consequently to raise θ_max. On the other hand, when k < 0,
θ_max can be reduced over the impulse case. We will demonstrate this
effect in connection with the cumulative distribution in Section IX of
the paper.

As noted previously, the long string of spaces implied by large n_c
makes the timing wave amplitude so small as to be useless in a real repeater.
The timing wave amplitude can be increased by forcing a periodic
pulse pattern. With the constraint that every Mth pulse must occur,
the pattern that yields the maximum value for θ is as before, where n_c
is now given by

\[ \frac{\beta^{n_c+1}}{(1-\beta)^2} + \frac{M\beta^{M}(1-\beta^{rM})}{(1-\beta^{M})^2} - \frac{rM\beta^{(r+1)M}}{1-\beta^{M}} = -\frac{a}{2\pi k} + n_c\left[1 + b + \frac{\beta^{M}(1-\beta^{rM})}{1-\beta^{M}}\right], \tag{15} \]

where r is the largest integer less than n_c/M. It can be seen that (15)




reduces to (14) as M → ∞, as expected. Furthermore, since the difference
in the last two terms of (15) is positive and the term added to 1 + b is
also positive, it is apparent that the effect of the periodic pattern is to
reduce n_c and consequently θ_max, as expected.
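The effect of a forced periodic pattern can also be seen by brute force, without solving for n_c in closed form. The sketch below is an illustration under stated assumptions: impulse excitation (a = b = 0), a pulse at the origin, the small-kQ forms x ≈ 1 + Σ a_n β^n and y ≈ 2πk Σ n a_n β^n, and a search restricted to threshold patterns (all pulses present for n ≥ N, plus forced pulses at multiples of M below N). The function name and the truncation lengths are arbitrary.

```python
import math

def best_threshold(Q, k, M=None, n_terms=4000, n_search=300):
    """Return (theta_max, N) over threshold patterns, optionally with every Mth pulse forced."""
    beta = math.exp(-math.pi / Q)
    w = [beta ** n for n in range(n_terms)]
    # Suffix sums of beta**n and n*beta**n, so tail0[N] covers all n >= N.
    tail0 = [0.0] * (n_terms + 1)
    tail1 = [0.0] * (n_terms + 1)
    for n in range(n_terms - 1, -1, -1):
        tail0[n] = tail0[n + 1] + w[n]
        tail1[n] = tail1[n + 1] + n * w[n]
    best = (-1.0, None)
    forced0 = forced1 = 0.0      # contribution of forced pulses strictly below N
    for N in range(1, n_search):
        if M is not None and N > 1 and (N - 1) % M == 0:
            forced0 += w[N - 1]
            forced1 += (N - 1) * w[N - 1]
        theta = 2 * math.pi * k * (tail1[N] + forced1) / (1.0 + tail0[N] + forced0)
        if theta > best[0]:
            best = (theta, N)
    return best

theta_free, n_free = best_threshold(100.0, 1e-3)
theta_forced, n_forced = best_threshold(100.0, 1e-3, M=8)
print(n_free, theta_free, n_forced, theta_forced)
```

The choice M = 8 is only an example; any finite M lowers both the optimum threshold and the maximum phase error relative to the unconstrained case.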


3.4 Probability Density Function, p(θ)


With the above preliminaries disposed of, we will proceed to use (13)
to develop an approximate expression for p(θ). To do this we assume
that the logarithm of the characteristic function possesses a power
series expansion in the neighborhood of u = 0 = v. The general form of
this series is

\[ \log \varphi(u, v) = \sum_{r+s>0} \frac{\lambda_{rs}}{r!\, s!} (iu)^r (iv)^s, \tag{16} \]

where the λ_rs are the semi-invariants of the distribution for x_1 and y_1.
Since

\[ \frac{\partial \varphi}{\partial u} = \varphi\, \frac{\partial}{\partial u} [\log \varphi], \]

we may write
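\[ p(\theta) = \frac{1}{2\pi i} \int_{-\infty}^{\infty} \left\{ \exp[\log \varphi]\, \frac{\partial}{\partial u} [\log \varphi] \right\}_{u = -\theta v} dv. \tag{17} \]

Using (17) and performing the differentiation indicated in the integrand,
we get

\[ p(\theta) = -\frac{1}{2\pi i} \frac{d}{d\theta} \int_{-\infty}^{\infty} \frac{dv}{v} \exp\left[ \sum_{r+s>0} \frac{\lambda_{rs}}{r!\, s!} (-\theta)^r (iv)^{r+s} \right]. \tag{18} \]

We now remove terms from the double summation for which r + s ≤ 2.
The remaining terms we treat as u, and expand e^u in a power series
retaining only the first two terms (e^u ≈ 1 + u). In this case p(θ) becomes
approximately

\[ p(\theta) \approx p_0(\theta) + \sum_{r+s>2}^{r+s=6} \frac{\lambda_{rs}}{r!\, s!} (-1)^r\, p_{rs}(\theta), \tag{19} \]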


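where

\[ p_0(\theta) = -\frac{1}{2\pi i} \frac{d}{d\theta} \int_{-\infty}^{\infty} \frac{dv}{v} \exp\left[ -iv(\lambda_{10}\theta - \lambda_{01}) - \frac{v^2}{2} (\lambda_{20}\theta^2 - 2\lambda_{11}\theta + \lambda_{02}) \right], \]

or p_0(θ) = (d/dθ) f_0(θ), where f_0(θ) is defined by comparison with the
above.

Similarly,

\[ p_{rs}(\theta) = -\frac{1}{2\pi i} \frac{d}{d\theta}\, \theta^r \int_{-\infty}^{\infty} \frac{dv}{v} (iv)^{r+s} \exp\left[ -iv(\lambda_{10}\theta - \lambda_{01}) - \frac{v^2}{2} (\lambda_{20}\theta^2 - 2\lambda_{11}\theta + \lambda_{02}) \right], \]

or

\[ p_{rs}(\theta) = \frac{d}{d\theta} f_{rs}(\theta). \]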


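An upper limit for the double summation in (19) is set in order to make
the approximation for p(θ) consistent with the number of terms used
in the power series expansion for e^u. The reason for 6 as an upper limit
will become apparent when we discuss the semi-invariants, λ_rs, in detail
in Section V. Performing the differentiations and integrations indicated
in (19) we finally arrive at

\[ p(\theta) \approx \frac{e^{-A_0^2(\theta)/2A_1(\theta)}}{\sqrt{2\pi}} \left\{ \frac{A_2(\theta)}{A_1^{3/2}(\theta)} + \sum_{r+s>2}^{r+s=6} \frac{\lambda_{rs}}{r!\, s!} \frac{(-1)^r \theta^{r-1}}{A_1^{(r+s+2)/2}(\theta)} \left[ A_3(\theta)\, H_{r+s-1}\!\left( \frac{A_0(\theta)}{\sqrt{A_1(\theta)}} \right) + \frac{\theta A_2(\theta)}{\sqrt{A_1(\theta)}}\, H_{r+s}\!\left( \frac{A_0(\theta)}{\sqrt{A_1(\theta)}} \right) \right] \right\}, \tag{20} \]

where

\[ A_0(\theta) = \lambda_{10}(\theta - \theta_0), \]
\[ A_1(\theta) = \lambda_{20}\theta^2 - 2\lambda_{11}\theta + \lambda_{02}, \]
\[ A_2(\theta) = \lambda_{10}\left[ \theta(\lambda_{20}\theta_0 - \lambda_{11}) - (\lambda_{11}\theta_0 - \lambda_{02}) \right], \]
\[ A_3(\theta) = s\lambda_{20}\theta^2 + (r - s)\lambda_{11}\theta - r\lambda_{02}, \]

and

\[ \theta_0 = \frac{\lambda_{01}}{\lambda_{10}}. \]

The H's are Hermite polynomials defined by

\[ H_n(Z) = (-1)^n e^{Z^2/2} \frac{d^n}{dZ^n} \left( e^{-Z^2/2} \right). \]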
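For numerical work the Hermite polynomials are most easily generated by recurrence. The sketch below assumes the definition based on the weight e^{−Z²/2} (the convention usual for Edgeworth-type series), for which H_{m+1}(Z) = Z H_m(Z) − m H_{m−1}(Z); the function name is illustrative.

```python
def hermite(n, z):
    """H_n(z) for the weight exp(-z**2/2), via H_{m+1} = z*H_m - m*H_{m-1}."""
    h_prev, h = 1.0, z           # H_0 and H_1
    if n == 0:
        return h_prev
    for m in range(1, n):
        h_prev, h = h, z * h - m * h_prev
    return h

# H_4(z) = z**4 - 6*z**2 + 3, so H_4(1) = -2.
print(hermite(4, 1.0))
```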

The result in (20) gives a general expression for p(θ) as a function of
the semi-invariants of the distribution of x_1 and y_1. The solution obtained
is approximate in that it depends upon an asymptotic expansion
analogous to the Edgeworth series. As noted by Cramér, one is not
particularly interested in whether series of this type converge or not,
but whether a small number of terms suffice to give a good approximation
to the probability density function over a specified range of its argument.
In our case, the statistical properties of the input pulse pattern
and the parameters of the timing circuit are controlling in this regard.
With this in mind, the determination of the range in θ over which a
valid approximation may be obtained in various cases is deferred for
the present.


IV. CUMULATIVE DISTRIBUTION FUNCTION 


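The cumulative distribution function F(θ) may be determined using
the results derived in the preceding section. Beginning with (19) we
may write

\[ p(\theta) \approx f_0'(\theta) + \sum_{r+s>2}^{r+s=6} \frac{\lambda_{rs}}{r!\, s!} (-1)^r f_{rs}'(\theta). \tag{21} \]

By definition*

\[ F(\theta) = \int_{-\infty}^{\theta} p(u)\, du. \]

Integrating (21) between the limits indicated, F(θ) becomes

\[ F(\theta) \approx f_0(\theta) + \sum_{r+s>2}^{r+s=6} \frac{\lambda_{rs}}{r!\, s!} (-1)^r f_{rs}(\theta) + \frac{1}{2}. \tag{22} \]

Referring back to (19) and performing the integration over v necessary
to determine f_0(θ) and f_rs(θ), we get

\[ F(\theta) \approx \frac{1}{2} + \frac{1}{2}\, \mathrm{erf}\!\left[ \frac{A_0(\theta)}{\sqrt{2A_1(\theta)}} \right] - \frac{1}{\sqrt{2\pi A_1(\theta)}} \exp\!\left[ -\frac{A_0^2(\theta)}{2A_1(\theta)} \right] \sum_{r+s>2}^{r+s=6} \frac{\lambda_{rs}}{r!\, s!} \frac{(-1)^r \theta^r}{A_1^{(r+s-1)/2}(\theta)}\, H_{r+s-1}\!\left( \frac{A_0(\theta)}{\sqrt{A_1(\theta)}} \right), \tag{23} \]

where A_0(θ), A_1(θ) and H_{r+s−1} have been previously defined.


* The significance of the lower limit of integration in the definition of F(θ)
will be discussed in connection with the numerical results.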
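The principal (normal) terms of the density and of the cumulative distribution are simple enough to evaluate directly. The sketch below is an illustration only: it keeps just the leading term of each series, and it assumes random impulse excitation with p = 1/2 and kQ ≪ π, for which the first- and second-order semi-invariants follow from the independence of the a_n. The function names are hypothetical.

```python
import math

def second_order_semi_invariants(Q, k):
    """lambda_10, lambda_01, lambda_20, lambda_11, lambda_02 for random impulses, p = 1/2."""
    beta = math.exp(-math.pi / Q)
    b2 = beta * beta
    l10 = 1.0 / (2.0 * (1.0 - beta))
    l01 = math.pi * k * beta / (1.0 - beta) ** 2
    l20 = 1.0 / (4.0 * (1.0 - b2))
    l11 = 0.5 * math.pi * k * b2 / (1.0 - b2) ** 2
    l02 = (math.pi * k) ** 2 * b2 * (1.0 + b2) / (1.0 - b2) ** 3
    return l10, l01, l20, l11, l02

def leading_terms(theta, Q, k):
    """Return (p_0, F_0): the leading terms of the density and cumulative distribution."""
    l10, l01, l20, l11, l02 = second_order_semi_invariants(Q, k)
    theta0 = l01 / l10
    a0 = l10 * (theta - theta0)
    a1 = l20 * theta**2 - 2.0 * l11 * theta + l02
    a2 = l10 * (theta * (l20 * theta0 - l11) - (l11 * theta0 - l02))
    p0 = a2 / (math.sqrt(2.0 * math.pi) * a1**1.5) * math.exp(-a0 * a0 / (2.0 * a1))
    f0 = 0.5 + 0.5 * math.erf(a0 / math.sqrt(2.0 * a1))
    return p0, f0

print(leading_terms(0.22, Q=100.0, k=1e-3))
```

A useful consistency check is that the leading density term is the derivative of the leading distribution term, and that the latter equals 1/2 at θ_0 = λ_01/λ_10, which is close to 2kQ.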


V. SEMI-INVARIANTS FOR THE DISTRIBUTION OF x_1 AND y_1


In this section we consider the coefficients of the power series expansion
for the logarithm of the characteristic function φ(u, v). These are
determined as functions of the parameters of the timing circuit and the
excitation, and provide the necessary information for an explicit solution
for p(θ) and F(θ). A closed form for the λ_rs is obtainable for all excitations
of interest under the condition p = 1/2 (pulses and spaces equally
likely). [The semi-invariants for any p can be obtained by appropriate
differentiations of log φ(u, v). We have not expended the energy for this
exercise.] The semi-invariants are shown below for random impulse
excitation under the condition kQ ≪ π and are derived for all excitations
we consider in Appendix D.*


\[ \lambda_{10} = \frac{1}{2(1 - \beta)}, \qquad \lambda_{01} = \frac{\pi k \beta}{(1 - \beta)^2}, \tag{24} \]

\[ \lambda_{rs} \Big|_{r+s \ge 2} = \frac{(-1)^s B_{r+s} (2^{r+s} - 1)(2\pi k)^s}{r + s}\, \frac{d^s}{dq^s} \left( \frac{1}{1 - e^{-q}} \right), \tag{25} \]

where β = e^{−π/Q}, q = (π/Q)(r + s), and the B_{r+s} are Bernoulli numbers.
Since B_{r+s} = 0 for r + s odd and > 1, we note that the odd order
semi-invariants given in (24) and (25) vanish beyond order 1. Therefore,
since the λ_rs for r + s = 3 are zero, one can extend the upper limit in
the double summation in (19) to 6, and still maintain consistency with
the fact that only 2 terms in the power series expansion for the exponential,
e^u, were used in the approximation for p(θ). This conclusion is
valid for all excitations of interest.
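The semi-invariants can also be obtained by direct summation, which gives an independent check on the closed forms. The sketch below assumes random impulse excitation with p = 1/2 and the small-kQ forms x = Σ a_n β^n, y = 2πk Σ n a_n β^n; because the a_n are independent, λ_rs is the (r + s)th cumulant of a single pulse variable times Σ β^{nr} (2πk n β^n)^s. The table `KAPPA` and the function name are introduced here for illustration.

```python
import math

# Cumulants of one pulse variable a_n (0 or 1 with probability 1/2):
# (2**m - 1) * B_m / m for even m, and zero for odd m > 1.
KAPPA = {1: 0.5, 2: 0.25, 3: 0.0, 4: -0.125, 5: 0.0, 6: 0.25}

def semi_invariant(r, s, Q, k, n_terms=5000):
    """lambda_rs by direct summation over the time slots n = 0 .. n_terms - 1."""
    beta = math.exp(-math.pi / Q)
    total = sum((beta ** n) ** r * (2 * math.pi * k * n * beta ** n) ** s
                for n in range(n_terms))
    return KAPPA[r + s] * total

print(semi_invariant(1, 0, 100.0, 1e-3), semi_invariant(0, 1, 100.0, 1e-3))
```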


* The more general semi-invariants without the restriction kQ ≪ π are given
in Appendix D; however, they are too long to be repeated here.


VI. BEHAVIOR OF p(θ) FOR LARGE Q

When the Q of the resonant circuit becomes large, the past history of
the input signal becomes increasingly important in determining the
statistical properties of x and y. This follows from the form of the exponential
term in the expressions for x and y given in (5). Invoking the
Central Limit Theorem under this condition, one would expect the
values of x and y to begin heaping up about their respective means with
the probability density function p(x, y) approaching a two dimensional
normal distribution. Analogous behavior is expected of θ and we will
now consider p(θ) as given by (20) in the neighborhood of its mean for
large Q. The discussion is restricted to the case of random impulse
excitation, but the results for other excitations parallel those of this
section.
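To determine p(θ) near its mean, we write, using the previous condition
kQ ≪ π,

\[ \theta \approx 2\pi k\, \frac{\sum_{n=0}^{\infty} n\, a_n e^{-\alpha n}}{\sum_{n=0}^{\infty} a_n e^{-\alpha n}}, \tag{26} \]

where

\[ \alpha = \frac{\pi}{Q}. \]

For this to hold as Q becomes arbitrarily large, we require the kQ
product to be constant. Since

\[ \sum_{n=0}^{\infty} n\, a_n e^{-\alpha n} = -\frac{d}{d\alpha} \sum_{n=0}^{\infty} a_n e^{-\alpha n}, \]

θ can also be written as

\[ \theta \approx -2\pi k \frac{d}{d\alpha} [\log x] = -2\pi k \frac{d}{d\alpha} \left[ \log \frac{x}{\bar{x}} + \log \bar{x} \right], \tag{27} \]

where x̄ is the average value of x. Expanding log x/x̄ in a power series
in the neighborhood of 1 (x near x̄), and keeping only the first term, θ
becomes

\[ \theta \approx -2\pi k \frac{d}{d\alpha} [\log \bar{x}] - 2\pi k \frac{d}{d\alpha} \left[ \frac{x - \bar{x}}{\bar{x}} \right]. \tag{28} \]

Differentiating the above with respect to α we get for θ in the neighborhood
of its mean

\[ \theta \approx \frac{\bar{y}}{\bar{x}} + \frac{y - \bar{y}}{\bar{x}} - \frac{\bar{y}\,(x - \bar{x})}{\bar{x}^2}. \tag{29} \]

In determining this result we make use of the fact that

\[ y = -2\pi k \frac{d}{d\alpha} [x]. \tag{30} \]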

Using (29) one can determine the logarithm of the characteristic function
of θ, and the associated semi-invariants of the θ distribution. When
this is done, the mean of θ is

\[ \bar{\theta} \approx \frac{2\pi k \beta}{1 - \beta}, \]

which also can be derived directly from (29). The standard deviation
and the 4th semi-invariant are given by

\[ \sigma^2 = \lambda_2 \approx \frac{2(2\pi k)^2 \beta^2}{(1 - \beta)(1 + \beta)^3}, \tag{31} \]





Sr ier 


66°(1 + *)(1 — B)° 





_ —2(2nk)‘e*T, — 46°(1 — B) 
MES alex Be) it @=fy ? t= pp 
_ 4e(1 — 8)°(1 + 46° + 6°) 
(r — Bi oy 
pol 2 ee | 
(P= 6" 


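with β = e^{−α}. These same results can be derived using (20) and including
only the first correction term from the double sum (i.e., only those λ_rs
for which r + s = 4). The details of the calculation along with the λ_rs
of interest are given in Appendix D. The final result for p(θ) is

\[ p(\theta) \approx \frac{1}{\sqrt{2\pi}\, \sigma}\, e^{-(\theta - \bar{\theta})^2 / 2\sigma^2} \left[ 1 + \frac{\lambda_4}{4!\, \sigma^4}\, H_4\!\left( \frac{\theta - \bar{\theta}}{\sigma} \right) \right], \tag{33} \]

where λ_4 is the 4th semi-invariant of θ given by (32).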


The above equation for p(θ) is in the form of the standard Edgeworth
approximation. In the limit as Q becomes large (β → 1), and with kQ
constant, p(θ) reduces to

\[ p(\theta) \approx \frac{1}{\sqrt{2\pi}\, \sigma}\, e^{-(\theta - \theta_0)^2 / 2\sigma^2} \left[ 1 - \frac{5\pi}{32 Q}\, H_4\!\left( \frac{\theta - \theta_0}{\sigma} \right) \right], \tag{34} \]

with θ_0 = 2kQ and σ = k√(πQ). Equation (34) indicates the approach
to the normal law as Q becomes large with the first correction term going
as 1/Q. The above results for θ_0 and σ correspond to those derived
earlier by Bennett by another method. If we rewrite σ as kQ√(π/Q) we
notice that p(θ) becomes more peaked with increasing Q, and falls off
quite rapidly as θ departs from the mean. In the high Q case the concentration
about θ_0 becomes more pronounced as expected.
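The narrowing of the distribution with increasing Q at fixed kQ can be tabulated directly from θ_0 = 2kQ and σ = k√(πQ). The sketch below also evaluates a finite-Q standard deviation from the expression 2(2πk)²β²/[(1 − β)(1 + β)³], which is taken here as the reading of (31); that reading, and the function names, are assumptions of this example.

```python
import math

def large_q_normal(Q, k):
    """Mean and standard deviation of theta in the large-Q limit."""
    return 2.0 * k * Q, k * math.sqrt(math.pi * Q)

def finite_q_sigma(Q, k):
    """Standard deviation from the assumed finite-Q variance expression."""
    beta = math.exp(-math.pi / Q)
    variance = 2.0 * (2.0 * math.pi * k) ** 2 * beta**2 / ((1.0 - beta) * (1.0 + beta) ** 3)
    return math.sqrt(variance)

for Q in (100.0, 500.0, 2500.0):
    k = 0.1 / Q                  # keep kQ fixed at 0.1
    theta0, sigma = large_q_normal(Q, k)
    print(Q, theta0, sigma, finite_q_sigma(Q, k))
```

With kQ held at 0.1 the mean stays at 0.2 radian while σ falls as 1/√Q, and the finite-Q value is already within a fraction of a percent of the limiting one at Q = 100.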

It is to be emphasized that the general properties of p(θ) for large Q
demonstrated here will be true for the other inputs also. For example,
with random impulse excitation plus 1 out of M pulses forced, the
average value will remain the same as above but σ will be a function of
M:

\[ \sigma \approx k \sqrt{\pi Q\, \frac{M(M - 1)}{(M + 1)^2}} \quad \text{for } \frac{Q}{M} \gg 1. \]

The effect of M is to reduce σ and therefore increase the concentration
about the mean. As M becomes large (fewer pulses required to occur),
the effect of M becomes insignificant for this large Q case.


VII. NUMERICAL RESULTS FOR p(θ) AND 1 − F(θ): IMPULSE EXCITATION


7.1 p(θ)


To determine the behavior of the probability density function for
finite Q, we must use the general form of the approximation to p(θ)
given by (20), since most of the approximations made in the previous
section for Q arbitrarily large are no longer valid. By way of illustration
we consider the case Q = 100, k = 10^{-3} with impulse excitation and all
pulses random (p = 1/2). For negative mistuning, k = −10^{-3}, the curve
for p(θ) will be identical with that for k positive except that θ is replaced
with −θ. The result for the probability density function is shown
in Fig. 3. The calculations* upon which this curve is based include
the first and second correction terms of (20); i.e., terms for which
r + s = 4 and r + s = 6. Points beyond θ = 0.13 radians on the lower end
and θ = 0.35 radians on the upper end are not included, since the approximation
begins to fail at these extremes. More specifically, the
probability density obtained from (20) goes negative somewhere between
θ = 0.13 radians and θ = 0.12 radians and between θ = 0.35 and θ = 0.36
radians. However, as we shall see later, up to these points the results
for the cumulative distribution are in good agreement with computer
simulation. The cumulative distribution is also shown on Fig. 3 to point
out the fact that the median occurs slightly below the approximate mean
given by 2kQ. In addition, it is apparent from the shape of p(θ) and


* Equation (20) and all subsequent calculations for p(θ) and F(θ) were programmed
for the IBM 7090 computer by Miss E. G. Cheatham.

Fig. 3 — p(θ) and F(θ) as a function of θ for k = 10^{-3} and Q = 100. Random
impulse excitation.


F(θ) that the probability density is skewed in the direction of increasing
phase error. This is more easily visualized from Fig. 4, where we have
shown p(θ) as in Fig. 3 plotted on log paper. The normal probability
density with the same mean and variance as our computed curve is also
shown to further illustrate the skewness.

On Fig. 5 we have plotted p(θ), as defined in (20), to illustrate the
contribution of its constituent terms. From this figure we see that the
principal term (always positive) predominates over most of the range.
At the tails, the terms involving λ_rs for r + s = 4 pull p(θ) in and
force the density to become negative. The last term in the approximation,
for which r + s = 6, serves to extend the region over which p(θ)
remains positive.

When 1/M pulses are forced, the skewness is reduced, as is the variance.
There are several ways of explaining this effect. First, as discussed
in Section 3, the denominator of θ in (8) or (9) is raised, thereby reducing
the range of variation of the timing wave amplitude and confining θ to a
narrower range. This is expected from the physical standpoint, since
forcing a periodic pattern with the remaining pulses and spaces equally
likely is similar to increasing the probability of occurrence of a pulse in
an all-random sequence. Since the pulses, when they occur, have the
proper spacing, they will tend to correct for the departure of the zero
crossings from the mean that has occurred during the free response of
the tuned circuit in the absence of a pulse. Indeed, in the limit when
M = 1 (all pulses definitely occur), all the probability is concentrated
at the mean, 2kQ, which is identical to the steady state phase shift of
the tuned circuit in response to a sine wave at the pulse repetition frequency.
This behavior is also predicted mathematically from (20) and
the fact that λ_rs goes to zero for r + s > 1 when M = 1. The same effect
occurs when Q approaches infinity with kQ constant and it can be shown
from the results of the previous section that p(θ) goes to δ(θ − θ_0) when the
limit is taken. In this light, we can view the introduction of forced pulses
as effectively increasing the Q of the tuned circuit while maintaining kQ
fixed.

Fig. 4 — p(θ) for k = 10^{-3} and Q = 100. The normal curve with the same mean
and variance is also shown for comparison. Random impulse excitation.

Fig. 5 — Contributions of various terms involved in the p(θ) approximation
given by (20). Random impulse excitation is assumed, with k = 10^{-3} and Q = 100.

Fig. 6 — The effect on p(θ) of requiring 1/M impulses to occur. k = 10^{-3}, Q = 100.


In practical applications, the effect of a pulse at the origin is of particular
interest. Mathematically, this corresponds to M = ∞. Physically
this means we examine and record phase error only for those time slots
containing a pulse. Fig. 6 illustrates the narrowing of the density function
for M = ∞ (pulse at the origin), and M = 16, 8, and 4. It is
interesting to note that, for these cases, the probability density function
remains positive over the range of θ we have used in the computations
from 0.1 to 0.4 radian. This encompasses values of p(@) < 10‘ on the 
left of the mean and p(@) < 10” to the right of the mean. This is to be 
expected since the λ_rs decrease with decreasing M for r + s ≥ 2, thereby
reducing the importance of the terms involving the Hermite polynomials
in (20) and improving the approximation.

Fig. 7 depicts the behavior of p(θ) as Q grows with kQ fixed at 0.1.
The results are consistent with the predictions of the previous section. 


7.2 1 − F(θ)


For a closer inspection of the behavior of the distribution at its tails,
1 − F(θ) will be examined. This function as evaluated from (23) for
Q = 100, k = 10^{-3}, and purely random excitation (p = 1/2) is shown in
Fig. 8. The plot shown gives the probability that θ deviates from its
mean by more than some constant C times σ. In the same figure a
comparison of the calculated approximation with the normal curve of
identical mean and standard deviation indicates a substantial departure
from the normal law as the phase error increases. When periodic patterns
are interspersed with the random train, the departure from the mean is
further reduced, as can be seen from Fig. 9. Similar behavior is exhibited
in Fig. 10, where Q is increased from 100 to 500 and kQ maintained
constant at 0.1.

Fig. 7 — The effect on p(θ) of increasing Q with kQ = 0.1 and random impulse
excitation.

Fig. 8 — Comparison of 1 − F(θ) with the normal curve in the vicinity of the
tails. The normal curve is computed assuming the same mean and variance used
in determining 1 − F(θ). Random impulse excitation with Q = 100 and k = 10^{-3}
is assumed for computing 1 − F(θ).

Fig. 9 — The effect on 1 − F(θ) of requiring 1/M impulses to occur. k = 10^{-3},
Q = 100.


7.3 Comparison with other approaches 


Since we have made approximations in arriving at our expression for
the phase error, it is natural to ask how these approximations affect our
computed results. A comparison of our results with two other approaches
will be made for the case of impulse excitation. We recall from Section 2
that the phase error under impulse excitation is given by

\[ \tan \theta = \frac{y}{x}. \]

Fig. 10 — The effect on 1 − F(θ) of increasing Q with kQ = 0.1 and random
impulse excitation.


For kQ sufficiently small we can write

\[ \theta \approx 2\pi k\, \frac{\sum_{n=0}^{\infty} n\, a_n \beta^n}{\sum_{n=0}^{\infty} a_n \beta^n}. \tag{35} \]

The approximation of tan θ by its argument is not crucial in this case,
since a straightforward transformation can be made on the probability
distribution to correct for this approximation [i.e., p(θ) = sec² θ p(tan θ)].

H. Martens* shows that (35) can be manipulated to yield a recursion 
relationship for the phase error that is in a convenient form for digital 
computer evaluation. T. V. Crater and S. O. Rice used this approach in
some of their work, and a probability distribution so determined is 
shown by the dots in Fig. 11 for Q = 125. For the same value of Q, we 
have computed the probability distribution from the series in (23), and 
it is displayed as the solid curve of Fig. 11. It can be seen that the agree- 
ment between the two approaches is excellent. The scattering of the 
“experimental” points at the 10° level and below is due to the limited 
number of pulse positions considered by Crater and Rice. Specifically, 
10* pulse positions were processed after an initial transient of some 
5 X 10° pulse positions had elapsed. 
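A recursion of the general kind referred to above is easy to set up, although the particular form used by Martens and by Crater and Rice is not given here. The sketch below is therefore only one possible arrangement, built on the observation that when a new time slot arrives every past pulse moves one slot further back, so that the running sums X = Σ a_n β^n and Y = Σ n a_n β^n obey Y ← β(Y + X), X ← βX + a_new. The function name, the seed, the run length and the warm-up length are arbitrary choices.

```python
import math
import random

def simulate_phase_error(Q, k, n_slots, warmup=5000, seed=1962):
    """Generate samples of theta = 2*pi*k*Y/X slot by slot with random pulses (p = 1/2)."""
    rng = random.Random(seed)
    beta = math.exp(-math.pi / Q)
    x = y = 0.0
    samples = []
    for slot in range(warmup + n_slots):
        # Each past pulse moves from position n to n + 1; the new pulse enters at n = 0.
        y = beta * (y + x)
        x = beta * x + rng.randint(0, 1)
        if slot >= warmup:       # discard the initial transient
            samples.append(2.0 * math.pi * k * y / x)
    return samples

samples = simulate_phase_error(Q=100.0, k=1e-3, n_slots=100000)
mean = sum(samples) / len(samples)
std = math.sqrt(sum((t - mean) ** 2 for t in samples) / len(samples))
print(mean, std)
```

For Q = 100 and k = 10^{-3} the sample mean and standard deviation should fall near 2kQ = 0.2 and k√(πQ) ≈ 0.018 respectively, and a histogram of `samples` gives an empirical distribution against which the series results can be compared.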

In addition, S. O. Rice in unpublished work has shown that the tail
of the distribution should behave as A(4)*”™*, where A is an unknown 
constant. When we take the values of @ at the 10 and 10 * levels and 
substitute these in Rice’s asymptotic form and form a ratio, the con- 
stant A cancels out and we should obtain 10. The actual value for the 
ratio is 10.9, which tends to indicate that the asymptotic behavior has 
virtually been reached. This suggests that an extrapolation of the distri- 
bution to larger values of @ by merely continuing with the same slope 
should be valid.

We also note that we can write 


ee ace a? (yale 
5 _ 


where we have made use of θ_0 = 2kQ. With kQ constant, one would
expect the cumulative probability to fall off faster for larger Q, as is 
indeed the case. The slopes of the curves of Fig. 10 follow Rice’s pre- 
dictions quite closely. 

While the above comparisons are comforting, they only indicate that
our final expressions for p(θ) and F(θ) are accurate for computing these
quantities from the initial defining equation for θ. Approximations have
been made in arriving at the starting relationship. A check on these
initial approximations may be obtained from a simulation of the problem.
One such simulation has been accomplished by Miss M. R. Branower
using a combination of analogue and digital computers. The principal
errors introduced in this process involve the stability of the analogue
computer with time and the number of pulses processed. For a tuned
circuit characterized by a Q of 125 and mistuning k = +10^{-3}, the
computer simulation yields the results of Fig. 12. Results obtained using
(23), the exact semi-invariants of Appendix C, and the tan θ transformation
mentioned previously yield the "computed curve" of Fig. 12.
Again the results are in very close agreement. To indicate the effect of
the approximation kQ ≪ π, we have repeated the computed curve of
Fig. 11 on Fig. 12.


* Unpublished memorandum.

Fig. 11 — Comparison of 1 − F(θ) computed by (23) with the results of the
Crater-Rice simulation for Q = 125. Random impulse excitation is assumed.


VIII. RAISED COSINE EXCITATION


8.1 Results for 1 − F(θ)


With raised cosine excitation, the computations are performed as before
and only the semi-invariants λ_rs for r + s = 1 are changed from the





Fig. 12 — Comparison of 1 − F(θ) computed by (23) with the results of an
analog simulation due to M. R. Branower. Random impulse excitation with
Q = 125 and k = 10^{-3} is assumed. The effect of the tan θ approximation is shown
together with results for both approximate and exact semi-invariants. [Plot of
the probability of exceeding the abscissa versus θ in radians, Q = 125, k = 10^{-3}.
Curves: analog simulation data due to M. R. Branower; computed from (23)
with approximate semi-invariants, tan θ correction not included; computed
from (23), tan θ correction included.]

Fig. 13 — Plot of 1 − F(θ) for raised cosine excitation. Pulses of width T and
1.5T are assumed in the calculation. The distribution of the phase error for both
positive and negative-going zero crossings is shown. Q = 100, k = 10^{-3}.


previous case. Results obtained for this excitation are shown on Fig. 13,
where it is apparent that the use of the widest pulses and positive-going zero
crossings yields the largest phase error. The effect of Q and k with this
type of input is the same as with impulses.
8.2 Comparison with another approach when k = 0


In the absence of mistuning, the phase error becomes 


9 (36) 


St 
at’ 


and the probability distribution for θ may be obtained by methods given
previously, or by the following relationship:


IV 


a+b 


= Prob (2s °—*). 


Prob (6 =) = Prob (5 s) 
(37) 





Therefore, if the distribution for x is known, the distribution for θ may
be determined from it. The random variable x is the normalized timing
wave amplitude defined by Rowe. This random variable has been considered
by S. O. Rice in unpublished work, and he has developed a procedure
for closely approximating its probability distribution. Using the
method of moments, one of the authors also computed this distribution.
The results were in excellent agreement with Rice's results, and the
cumulative distribution obtained by the moment method is shown in
Fig. 14. It can be shown that the probability density for x is unimodal
and symmetric about its mean; therefore, the data on Fig. 14 suffice to
specify the complete distribution. With these data and (37) we can
determine the distribution for θ. Alternatively, we can use (23) to make
this computation. A comparison of the distributions obtained by the two
approaches is shown in Fig. 15, and it can be seen that the agreement is
very close. Thus we have found another check on our series approximation
for p(θ). Conversely, we can use the distribution for θ to compute
the distribution for x. In this regard it is interesting to note that when
the Edgeworth expansion including semi-invariants through order 6 is
used to approximate the distribution for x, the density function begins
to turn negative in the neighborhood of 3σ from the mean, indicating
failure of the approximation. On the other hand, using the same number
of semi-invariants in the expansion for p(θ), where θ in this case is
essentially the reciprocal of x, we obtain a good approximation to the
cumulative distribution for x. This is believed to be due to the narrowness
of the range of θ as compared with x; i.e., x varies from 1 to 1/(1 − β) ≈
Q/π, while 1/x goes from 1 − β ≈ π/Q to 1.
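
The range and symmetry quoted above for x can be checked with a short Monte Carlo sketch. It assumes, as in Appendix A, that x is the sum of a_n β^n with a_0 = 1, β = exp(−π/Q), and the remaining a_n independent and equal to 0 or 1 with probability 1/2. The function name, the truncation length, the seed, and the sample count are illustrative choices and are not taken from the paper.

```python
import math
import random

def sample_x(Q, n_terms, rng):
    """Draw one sample of x = sum_{n>=0} a_n * beta**n with a_0 = 1.

    Assumption: a_n (n >= 1) are independent, 0 or 1 with probability 1/2,
    and the infinite sum is truncated after n_terms terms.
    """
    beta = math.exp(-math.pi / Q)
    x = 1.0          # the pulse at n = 0 definitely occurs
    weight = 1.0
    for _ in range(1, n_terms):
        weight *= beta
        if rng.random() < 0.5:
            x += weight
    return x

Q = 100
beta = math.exp(-math.pi / Q)
rng = random.Random(1962)                 # fixed seed for repeatability
samples = [sample_x(Q, 8 * Q, rng) for _ in range(2000)]

x_upper = 1.0 / (1.0 - beta)              # every pulse present, about Q/pi
x_mean_theory = 1.0 + 0.5 * beta / (1.0 - beta)
x_mean_sample = sum(samples) / len(samples)
print(min(samples), max(samples), x_upper, x_mean_sample, x_mean_theory)
```

Every sample should fall between 1 and 1/(1 − β), and the sample mean should sit near the midpoint of that range, which is what symmetry about the mean requires.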
Fig. 14 — Probability distribution of the timing wave amplitude. Q = 100.


IX. OPTIMUM TUNING — FINITE PULSE WIDTH 


In the case of impulse excitation it should be apparent that zero mistuning,
k = 0, is the desired objective for no phase error. On the other
hand, with finite width pulses zero mistuning does not yield zero phase
error. Mistuning can be purposely introduced in the finite pulse width
case to make the mean value of θ zero, to minimize the variance of θ, or
to optimize some other parameter of the θ distribution.

An approximation to making the mean of θ zero may be obtained by
choosing k such that the average value of the numerator of θ is zero.
This means that

\[ 0 = a + \frac{\pi k \beta}{(1 - \beta)^2} \tag{38} \]

or

\[ k = -\frac{a (1 - \beta)^2}{\pi \beta}. \tag{39} \]


For example, when Q = 100 and a = 0.65, as for raised cosine pulses of
width 1.5T, then k = −2.05 × 10^-4 to satisfy (39). In the high Q case
(39) becomes k = −(aπ/Q^2).
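
As a numerical check of this example, the sketch below evaluates the zero-mean mistuning. It assumes that (39) reads k = −a(1 − β)²/(πβ) with β = exp(−π/Q), which is the reading consistent with the quoted high-Q limit. The function names are illustrative.

```python
import math

def zero_mean_mistuning(a, Q):
    """Mistuning k that zeroes the mean numerator of theta.

    Assumed reading of (39): k = -a * (1 - beta)**2 / (pi * beta),
    with beta = exp(-pi / Q).
    """
    beta = math.exp(-math.pi / Q)
    return -a * (1.0 - beta) ** 2 / (math.pi * beta)

def zero_mean_mistuning_high_q(a, Q):
    """High-Q limit quoted in the text: k = -a * pi / Q**2."""
    return -a * math.pi / Q ** 2

k_exact = zero_mean_mistuning(0.65, 100)
k_limit = zero_mean_mistuning_high_q(0.65, 100)
print(f"k from (39): {k_exact:.3e}, high-Q limit: {k_limit:.3e}")
```

With a = 0.65 and Q = 100 the two expressions agree closely and come out near −2.04 × 10^-4, within rounding of the value quoted in the text.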
Fig. 15 — Comparison of the distribution of θ as computed by (23) and that
determined from the distribution of the timing wave amplitude of Fig. 14. Raised
cosine pulses of width 1.5T drive a tuned circuit with a Q = 100 and zero mistuning.
Deviations in the neighborhood of negative-going zero crossings are
considered.
When the objective is to minimize the variance of θ, we consider σ
as defined in Appendix D; i.e.,

\[ \sigma = \frac{\left(\lambda_{20}\theta_0^2 - 2\lambda_{11}\theta_0 + \lambda_{02}\right)^{1/2}}{\lambda_{10}}. \tag{40} \]

A plot of σ versus k is shown in Fig. 16, where it is seen that the minimum
σ occurs close to the "zero mean" value of k. Probability distributions
for values of k that encompass the optimum are shown on Fig. 17. The
narrowing of the density function for the optimum value of k is evident.

The results of this section suggest that when the tuned circuit in a 
self-timed repeater is adjusted, it should be excited with a random pulse 
train and the tuning adjusted to minimize the jitter on the leading edge 
Fig. 16 — Standard deviation of phase error as a function of mistuning with
raised cosine pulses 1.5T wide. Negative-going zero crossings are considered. Q =
100.

Fig. 17 — p(θ) for raised cosine excitation with various mistunings in the
neighborhood of the optimum mistuning. Negative-going zero crossings and
pulses 1.5T wide are assumed in making the calculations. Q = 100.


of the output pulse train as viewed, for example, on an oscilloscope. This 
is the method used for the adjustment of the repeater of Ref. 8. 


X. PARTIAL RETIMING 


In Section VIII we have shown that, in the absence of mistuning, the
variable θ can be related to the normalized timing wave amplitude x
and the distribution for θ determined from the distribution for x. Here
we will also make use of the distribution for x in order to analyze an
idealized version of a forward-acting partial retiming scheme. The
scheme we consider has been described by E. D. Sunde’ and analyzed
for periodic pulse patterns in Ref. 7. We make the same assumptions
here as in the latter reference, namely
1. The pulses exciting the tuned circuit are so narrow that they can 
be considered impulses. They are obtained by processing incoming 
pulses to the repeater and they excite a simple tuned circuit. 
2. The timing wave is so clamped that its maximum excursion is at 
ground. 
3. Reconstruction of the raised cosine pulse takes place when the 
algebraic sum of the timing wave and the raised cosine pulse crosses 
a threshold assumed to be at half the peak pulse amplitude. 
For random impulse excitation of the tuned circuit prior to t = 0 and
the definite occurrence of a pulse at t = 0, we have, according to the
above assumptions (with no pulse overlap)

\[ \frac{1}{2}\left(1 + \cos\frac{2\pi t s}{T}\right) - \frac{x}{2\bar{x}}\left(1 - \cos\frac{2\pi t}{T}\right) = \frac{1}{2} \tag{41} \]

for |t| < T/2s, where

\[ x = \sum_{n=0}^{\infty} a_n \beta^n, \]

a_0 = 1 (the pulse at the origin definitely occurs), and
x̄ = average value of x.


Equation (41) is based on the assumption that the average timing wave
has a peak-to-peak amplitude equal to the peak pulse height (i.e., when
x = x̄, the timing wave amplitude varies between −1 and 0). If we
define t_r as the time at which regeneration takes place and θ_r = 2πt_r/T
as the corresponding phase angle, then it can be seen from (41) that
this phase is a random variable dependent upon the random variable x.
We will solve for θ_r under the condition s = 1, which means that the
information-bearing pulses are resolved.* Under this condition −(π/2)
< θ_r < 0. Consistent with our previous definition of phase error, we will
consider the negative of θ_r, since this makes the phase error positive

* Other pulse widths and different ratios of average timing wave amplitude to 
pulse peak can be handled, but we will not consider them here. 
when we take our reference as the phase corresponding to the time at
which the pulse peak occurs (at t = 0). In this way a positive phase
error corresponds to regeneration prior to the pulse peak and permits
direct comparison with the results of Section VIII for the complete retiming
approach. Solving (41) for cos θ_r gives








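\[ \cos\theta_r = \frac{x/\bar{x}}{1 + x/\bar{x}} \tag{42} \]

and

\[ \operatorname{Prob}(\cos\theta_r < \lambda) = \operatorname{Prob}(\theta_r \ge \cos^{-1}\lambda) = \operatorname{Prob}\left(\frac{x/\bar{x}}{1 + x/\bar{x}} \le \lambda\right) = \operatorname{Prob}\left(x \le \frac{\lambda\bar{x}}{1 - \lambda}\right). \tag{43} \]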


It is apparent from the above that we can use the distribution for x to
determine the distribution for θ_r. For Q = 100, the distribution for x
is shown in Fig. 14 and with (43) enables us to obtain the distribution
for θ_r as shown in Fig. 18. When we compare this result with that of Fig.
15, which shows 1 − F(θ) for the case of complete retiming, it is apparent
that partial retiming results in a considerably larger variation of
phase error. This supports the contention made in Ref. 7.
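
The mapping from timing wave amplitude to regeneration phase can be sketched numerically. The code assumes that, for s = 1, (42) reads cos θ_r = (x/x̄)/(1 + x/x̄) and that (43) places the corresponding amplitude threshold at x = λx̄/(1 − λ). The function names and the sample value of x̄ are illustrative.

```python
import math

def regeneration_phase(x, x_mean):
    """Magnitude of the regeneration phase (radians) for resolved pulses, s = 1.

    Assumed reading of (42): cos(theta_r) = (x / x_mean) / (1 + x / x_mean).
    """
    ratio = x / x_mean
    return math.acos(ratio / (1.0 + ratio))

def amplitude_threshold(lam, x_mean):
    """Value of x below which cos(theta_r) < lam, per the assumed form of (43)."""
    return lam * x_mean / (1.0 - lam)

x_mean = 16.7   # illustrative: roughly the mean of x for Q = 100
for x in (0.5 * x_mean, x_mean, 1.5 * x_mean):
    print(f"x = {x:6.2f}  phase error = {regeneration_phase(x, x_mean):.4f} rad")
```

Under these assumptions a timing wave of average amplitude gives cos θ_r = 1/2, and a smaller amplitude gives a larger phase error, so the spread of x maps directly into a spread of regeneration phase.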


XI. CONCLUSIONS AND FUTURE WORK 


We have derived an approximate relationship for the probability 
density and cumulative distribution for the phase error at the output of a 
tuned circuit when it is excited by a random or random plus periodic 
pulse train. The effects of mistuning of the tuned circuit and the finite 
widths of the driving pulses have been considered. Three independent 
checks of our results indicate that the expressions given are excellent 
approximations to the true state of affairs for kQ < 0.1 and Q > 100. 
Regions defined by these limits encompass values of k and Q of interest 
in PCM systems under consideration. 

More specifically, we have shown that the distributions are not normal 
and are skewed in the direction of increasing phase error. When we 
consider pulse positions in which a pulse definitely occurs, it has been 
shown that the maximum phase error is bounded. In addition with 
raised cosine excitation we have demonstrated that the mistuning can 
be adjusted to minimize the mean or variance of the distribution for the 
Fig. 18 — Distribution of the phase error with partial retiming. Q = 100 and
k = 0. Raised cosine excitation pulse width = T.


phase error. The performance of an idealized version of a forward-acting 
partial retiming scheme has been analyzed and shown to be considerably 
inferior to a completely retimed repeater. 

There are several desirable directions in which to proceed from our present
position. First, it appears to be possible, in the case where we examine
pulses only, to start from the maximum value of θ and work back toward
the mean to better approximate the distribution near the tails. S. O. Rice
has used this approach in related problems with success. Second, it is of
interest to determine the pattern that gives the maximum phase error at
the output of a string of repeaters. This is not necessarily the pattern
that creates θ_max in a single repeater. In this regard, we have concentrated
on only a single repeater. Obviously it is of interest to extend our
results to a repeater string. This extension remains elusive.


APPENDIX A. DERIVATION OF EQUATION FOR NORMALIZED TIMING ERROR


A-1. Response of tuned circuit to random pulse train 


The impulse response of a parallel resonant circuit is well known to be 


\[ h(t) = \text{Real part of } \left[\left(1 + \frac{j}{2Q}\right) e^{-(\pi/Q) f_0 t}\, e^{+j 2\pi f_0 t}\right]. \tag{44} \]


Following Rowe,” we will imply the real part in all subsequent calculations
involving complex quantities. The pulse train applied to the tuned
circuit is given by

\[ r(t) = \sum_{n} a_n\, g(t - nT), \tag{45} \]

where:
a_n = 1 with probability p,
a_n = 0 with probability 1 − p, and
g(t) = pulse shape representing the binary 1.


The response of the tuned circuit to r(¢) is 


\[ z(t) = \int_{-\infty}^{t} r(\tau)\, h(t - \tau)\, d\tau. \tag{46} \]


In view of (45), this can be written 


\[ z(t) = T \sum_{n} a_n\, h(t - nT) \int_{-\infty}^{(t/T) - n} g(xT) \exp\left[\left(\frac{\pi}{Q} f_0 T - j 2\pi f_0 T\right) x\right] dx. \tag{47} \]
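Define

\[ f_0 = \frac{1 + k}{T} = f_r (1 + k), \tag{48} \]

with k = fractional mistuning from the pulse repetition frequency.
Equation (47) can be manipulated to yield

\[ z(t) = |A(t)|\, e^{j[2\pi f_r t + \Phi(t)]}, \tag{49} \]

where

\[ \Phi(t) = \tan^{-1}\frac{1}{2Q} + 2\pi f_r k t + \tan^{-1}\frac{\sum_n a_n e^{(\pi/Q)(1+k)n}\left[-I_1\left(\frac{t}{T} - n\right)\sin 2\pi k n + I_2\left(\frac{t}{T} - n\right)\cos 2\pi k n\right]}{\sum_n a_n e^{(\pi/Q)(1+k)n}\left[I_1\left(\frac{t}{T} - n\right)\cos 2\pi k n + I_2\left(\frac{t}{T} - n\right)\sin 2\pi k n\right]}, \tag{50} \]

\[ I_1\left(\frac{t}{T} - n\right) = \operatorname{Re}\int_{-\infty}^{(t/T)-n} g(xT)\exp\left[\left(\frac{\pi}{Q} f_0 T - j 2\pi f_0 T\right)x\right] dx, \quad\text{and} \]

\[ I_2\left(\frac{t}{T} - n\right) = \operatorname{Im}\int_{-\infty}^{(t/T)-n} g(xT)\exp\left[\left(\frac{\pi}{Q} f_0 T - j 2\pi f_0 T\right)x\right] dx. \tag{51} \]

In (49), |A(t)| represents the amplitude modulation on the carrier,
while Φ(t) represents the phase modulation, the quantity of primary
interest here.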


A-2, Equation for normalized timing error 


There is no loss in generality and it is convenient if the timing error 
is evaluated in the neighborhood of the pulse that occurs for n = 0. 
In this neighborhood, negative-going zero crossings occur where

\[ 2\pi f_r t + \Phi(t) = \frac{\pi}{2} \]

or

\[ \frac{t}{T} = \frac{1}{4} - \frac{\Phi(t)}{2\pi}. \tag{52} \]

Similarly, positive-going zero crossings occur for

\[ \frac{t}{T} = -\frac{1}{4} - \frac{\Phi(t)}{2\pi}. \tag{53} \]

In the absence of tuning error, and with impulse excitation, Φ = 0
and the negative- and positive-going zero crossings occur close to ±T/4,
respectively.* Using these zero crossings as a reference, it is easily seen
that the equations for normalized timing error become

\[ \frac{\epsilon_1}{T} = -\frac{1}{2\pi}\,\Phi\left(\frac{T}{4} + \epsilon_1\right) \tag{54} \]

for negative-going zero crossings and

\[ \frac{\epsilon_2}{T} = -\frac{1}{2\pi}\,\Phi\left(-\frac{T}{4} + \epsilon_2\right) \tag{55} \]

for positive-going zero crossings.

With the exception of the minor generalization to arbitrary pulse 
shape, the method employed thus far is identical with that used by 
Rowe.’ At this point in the evaluation of the timing error, we depart 
from his approximate solutions of (54) and (55) and attempt other 
approaches. Before proceeding in this direction, an indication of the 
approximation used by Rowe will be given. For the high Q case, Φ will
be small and will change only a small amount for small changes in 2πf_r t.
Based on this assumption,





= | 3 
bo 
S) 


(56) 


d 
we 
‘2 
oo 
aL 
ae 





Ke in (50) 


* Neglecting tan^{-1}(1/2Q).
It should be pointed out that these initial approximations are good for
Rowe’s purposes (steady-state error for 1/M patterns). However, for 
our purposes they need to be improved. 


A-3. Approximate solution of equation for normalized timing error 


One method for improving the accuracy of the initial approximation
is to expand Φ in a power series about T/4 for negative-going zero crossings
and retain two terms in the expansion to get

\[ \frac{\epsilon_1}{T} = -\frac{\Phi(T/4)}{2\pi + T\,\Phi'(T/4)}. \tag{57} \]

The form of Φ makes this approach messy and makes the determination
of the probability distribution more difficult.

Another approach that is more tractable involves the separate Taylor
expansion of I_1 and I_2 of (51) in Φ about the reference time. If we retain
only the first two terms in the Taylor expansion, replace the arctangent
by its argument, and neglect k with respect to unity, we obtain for
negative-going zero crossings


an 1 k 


T 47Q 4 


YS ane?” [—sin 2rkn (1(2 ~— n) + eqh’(i — n) 
~ 1 + cos Qrkn(Io(4 —n) + ex]y' (4 —n))] (58) 
Dee 
: SS ane "(cos Qakn (L1(4 — n) + ehy'(4 — n)) 


+ sin 2rkn (J2(4 — n) + erly’ (4 —n))] 


If terms in (ε_1/T)^2 are neglected, multiplication of both sides of (58)
by the long denominator on the right results in a linear equation for
ε_1/T. This equation is applicable to arbitrary pulse shape, time-limited
or not, and has been applied by one of the authors to periodic patterns
of both Gaussian and raised cosine pulses in unpublished work. The
results were compared with digital computer simulation and were in
excellent agreement, thereby giving us confidence in using this approach
for random pulse patterns. In this paper, we will concentrate on raised
cosine pulses. This enables us to make use of some of the results given
by H. E. Rowe in Section 2.5 of his paper.” For these time-limited pulses,
the limits of integration on the I's of (51) are modified in an obvious
way, and the upper limit on the sum over n is limited to the pulse immediately
succeeding the time slot of interest at n = 0 for negative-going
zero crossings. The evaluation of the various I's required is discussed
in Appendix B.

Subject to the above conditions, the normalized timing error, as derived
in Appendix B, can be written in the following form:

\[ \frac{\epsilon_1}{T} = \frac{Ay + Bx + C}{Dy + Ex + F}, \tag{59} \]

where

\[ y = \sum_{n=0}^{\infty} a_n e^{-n\pi/Q} \sin 2\pi k n, \qquad x = \sum_{n=0}^{\infty} a_n e^{-n\pi/Q} \cos 2\pi k n, \tag{60} \]

and a_0 = 1 (a pulse definitely occurs for n = 0). A through F are defined
in Appendix B and are functions of the pulse width and of the Q and
mistuning of the tuned circuit. In addition, C and F are functions of the
presence or absence of a pulse in the succeeding time slot for negative-going
zero crossings if sufficient pulse overlap exists. For positive-going
zero crossings the form of the equation for the normalized timing error
is the same and the new C and F are dependent upon the presence or
absence of a pulse in the preceding time slot. This assumes that the
pulse width is less than 2.5T.
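
The two sums in (60) are easy to evaluate for a given pulse pattern. The following sketch assumes the reading x = Σ a_n e^{−nπ/Q} cos 2πkn and y = Σ a_n e^{−nπ/Q} sin 2πkn with a_0 = 1. The function name and the finite pattern length are illustrative, and the constants A through F of (59) are not reproduced here.

```python
import math

def timing_sums(pattern, Q, k):
    """Return (x, y) of (60) for a finite 0/1 pulse pattern.

    pattern[n] is a_n; pattern[0] must be 1 because the pulse at n = 0
    is assumed to occur. Terms beyond len(pattern) are taken as zero,
    which is a truncation introduced here for illustration.
    """
    if pattern[0] != 1:
        raise ValueError("a_0 must be 1")
    x = 0.0
    y = 0.0
    for n, a_n in enumerate(pattern):
        decay = math.exp(-n * math.pi / Q)
        x += a_n * decay * math.cos(2.0 * math.pi * k * n)
        y += a_n * decay * math.sin(2.0 * math.pi * k * n)
    return x, y

Q, k = 100, 1e-3
all_ones = [1] * (10 * Q)
alternating = [1 if n % 2 == 0 else 0 for n in range(10 * Q)]
print(timing_sums(all_ones, Q, k))
print(timing_sums(alternating, Q, k))
```

With zero mistuning y vanishes and x for the all-ones pattern approaches 1/(1 − e^{−π/Q}), the upper end of the range of the timing wave amplitude.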


A-4. Modification of probability distributions for pulse overlaps 


With the dependence on the occurrence of a succeeding pulse, as is
the case for negative-going zero crossings with sufficient pulse overlap,
we must modify the determination of the probability distribution as
given in the main body of the paper. If we denote the timing error by
ε_1/T, with C = C_1, F = F_1, for a_1 = 1 (a succeeding pulse definitely
occurs), and by ε_2/T, with C = C_2, F = F_2, for a_1 = 0, then the average
probability distribution for the timing deviation will be given by

\[ \operatorname{Prob}\left(\frac{\epsilon}{T} \le \alpha\right) = p \operatorname{Prob}\left(\frac{\epsilon_1}{T} \le \alpha\right) + (1 - p) \operatorname{Prob}\left(\frac{\epsilon_2}{T} \le \alpha\right). \tag{61} \]

When the pulse width is less than 1.5T, C_1 = C_2, F_1 = F_2, and therefore
ε_1 = ε_2 and the above modification is not required. A similar procedure
is applicable for positive-going zero crossings.
APPENDIX B. RAISED COSINE PULSES


B-1. Determination of I’s 


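For a raised cosine pulse centered at the origin and of width T/s, I
of equation (51) becomes

\[ I(x) = \int_{-1/2s}^{x} (1 + \cos 2\pi s x_1)\, e^{[(\pi/Q) - j2\pi] K x_1}\, dx_1, \qquad -\frac{1}{2s} \le x \le \frac{1}{2s}, \]

\[ I(x) = I\left(\frac{1}{2s}\right), \qquad x > \frac{1}{2s}, \tag{62} \]

where
K = (1 + k).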
The integral in (62) is readily evaluated to give
glinl@)—al ke oe g rte ele 
30 
" p Pell@-eelke ot irez 9 U(a/Q)— Pel K/2s 
4 (63) 
K—-—s)t+ 
( J+tIi5 - 
1 el (r/@)— Ral Ke ge ald 
re 
TV 


(K+) +555 


The derivatives required in the evaluation of (58) may be obtained 
from 


\[ \frac{dI}{dx} = e^{(\pi/Q) K x}\left[e^{-j2\pi K x} + \frac{1}{2} e^{-j2\pi (K - s) x} + \frac{1}{2} e^{-j2\pi (K + s) x}\right]. \tag{64} \]


In the evaluation of I and dI/dx, mistuning makes very little difference
for the allowable values in practical systems. Therefore, with K = 1


lee ee E + cos = (65) 


Dieses yer E + cos z| (66) 
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i E + cos ars | (67) 
t - 71/4Q 37s 
I lee 3/4 = 7 € E + cos aes | - (68) 


Equations (65) and (68) above are required for negative-going zero 
crossings, while (66) and (67) are needed for positive-going zero cross- 
ings. 

B-2. Equation for Normalized Timing Error with Raised Cosine Pulses 


From (58) we can write the equation for normalized timing error as

\[ \frac{\epsilon_1}{T} = -\frac{1}{4\pi Q} - \frac{k}{4} - \frac{1}{2\pi}\frac{N}{P}, \]

where N and P are defined by comparison with (58). Cross multiplication
by P, neglecting terms in ε_1^2 and collecting terms, yields

\[ \frac{\epsilon_1}{T} = \frac{Ay + Bx + C}{Dy + Ex + F}, \tag{70} \]

where x and y are defined by (60), and A through F are as follows:


4=—o (3) + aot 14) 

B= 3 (3) — [aca + a] (3) 

-eL*@) -*(@)]-[ao + ]l@ -*@)) 
4 ay ot!) i sin eh (-2) — Gos QakT, (-2)| 
[dy ov (-) +a} 

vn) 

real 

ran()1(d)Leete ode () 


(69) 


| 


Q 
I 


548 THE BELL SYSTEM TECHNICAL JOURNAL, MARCH 1962 


+ ay {eo Qark E iy (-2) +1; (-?) 
1 k\ ,,( 3 ; t ivf 3 
+ (4 +. *) I; (-7)| + sin 27k | - 4 i; (-?) 
3 1 k; 3 
+s (-2) + (4 trot Si (-2) |}. 


For positive-going zero crossings, only the constants C and F are
changed.


B-3. Numerical Evaluation of Constants

In order to make use of some of Rowe’s results, we will choose the 
same two cases for pulse width that he used. 

Case 1. s = 1, Pulses Resolved


a. Negative-Going Zero Crossings. Since mistuning has a small effect
on the evaluation of the I's, we neglect it in this regard. Neglecting
terms in 1/Q^2 and k/Q, after some arithmetic one arrives at


l ‘1 k ua 0.0316 
a aa Geo - s) # epee ~O_ + 0.0085/ es 
7 3 1 0.06 
tong’ ta aoe 


Q > 50 and kQ < 0.2 encompass values of practical interest. In this 
region the term in y in the denominator of (71) can be neglected and 
the numerator term 0.0085k is also negligible. It is also convenient to 
deal with phase error rather than timing error. Therefore, we rewrite 
(71) as 








: set i 
1= —- 7 > nie c 372) 


The multiplication by −2π is used to avoid any questions later on as
to which way certain inequalities are to be taken. This means that θ is
the negative of the phase error as previously defined. A positive value
of θ signifies that the zero crossing occurs prior to ±T/4 for negative-going
and positive-going zero crossings, respectively. The general form
of θ for all the cases to be considered herein then can be written as

\[ \theta = \frac{a + y}{b + x} + c. \tag{73} \]

For the situation under consideration in this section, 
0.334 , @ 
= 0.159 + —— + =k 
a 0 + 0 + 3 } 
0.12 
b = —0,25: + —— 
Q 
Tv 


1/1 
c= - 4 [tH Fr}, 


b. Positive-Going Zero Crossings. Proceeding in the same way as in 
Sections B-2 and B-3 above, the phase error for positive-going zero 
crossings is as in (73) with 


0.2 3r 
a = 0.159 —— = 
Q 1 8) 
0.62 
b = —0.75 — 
eer 


lil T 
ce -i lt _ = Ka |. 


In this case it should be noted that with zero mistuning (y = 0) and
with a pulse for n = 0 and nowhere else, a positive-going zero crossing
does not occur in the neighborhood of −T/4. Under this special condition,
x = 1 and (73) with the constants of this section would predict
an incorrect error in the positive-going zero crossing. Of course such a
sparse pattern occurs with probability zero. Fortunately, for all other
more reasonable periodic patterns, results obtained from (73) are in
good agreement with computer simulation.


Case 2. s = 2/3, Pulses Overlapping, Base Width = 1.5T


a. Negative-Going Zero Crossings. In this section we will dispense 
with all of the algebra and arithmetic and simply write down the final 
results. For the case at hand 


0.255 1 


= 1 
on 4 Q 


Q 


- (0.034 — 0.02kQ) 
0.2552 — 0.062 + 0.048/Q ; 


(0.073 — 0.064kQ)x + 0.0264 + 
(74) 


es | 
When this is converted to the form of (73), we have


a = 0.65 + O — 0.21k 


0.188 
c=-—  i18 — 1.58kQ] 


b. Positive-Going Zero Crossings 


0.65 — ee + 0.94k 


a= 0 
1.66 
b = —0.753 panel 
ars 

Pee is S1sKol 

1 (18 — 1.58%) 


The remarks made in connection with positive-going zero crossings 
for Case 1 are equally applicable here. 


APPENDIX C. SEMI-INVARIANTS FOR THE JOINT DENSITY FUNCTION OF
x_1 AND y_1


C-1. One out of M pulses definitely occurs; the remaining pulses are independent
and occur with probability 1/2; raised cosine pulses.


The characteristic function is defined as

\[ \varphi(u, v) = E \exp i(u x_1 + v y_1), \tag{75} \]

where E is the expectation operator, and from Appendix B


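\[ x_1 = \sum_{m=0}^{\infty} e^{-\alpha M m}\cos 2\pi k M m + b + \sum_{n \ne mM} a_n e^{-\alpha n}\cos 2\pi k n, \]

\[ y_1 = \sum_{m=0}^{\infty} e^{-\alpha M m}\sin 2\pi k M m + a + \sum_{n \ne mM} a_n e^{-\alpha n}\sin 2\pi k n, \tag{76} \]

with α = π/Q. Substituting (76) in (75) and performing the expectation
operation gives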
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fea) 
g(uy) = expi >) ¢ /?™*™ (u cos 2akMm + v sin 2akMm) 
m=—0 


-exp i(ub + va) X [] expt e "(yu cosQakn + vsin 2rin)} (77) 
nm M 
oo | egal 
x J] cos| — (u cos 2rkn + v sin 2rin)} 


nemM 





which may be rearranged to 


guy) = exp] § a {-eroar (u cos 2akMn + vsin 2rkMn) 


n=0 \ 


+ @ 2" (y eos Qakn + v sin 2rien)} | exp 7(ub + va) 








oo eg ian \ (78) 
II cos 5 (u cos 2rkn + sin 2rkn) ; 
n=0 
x oo jenn oe 
II cos (u cos 2rkMn + v sin 2akitn)| 


( 
When we take the logarithm of (78), we obtain 


ll 
° 


n 


log g(uv) = 5 2 [e""(u cos 2rkMn + v sin 2rkMn) 
+ B(u cos 2rkn + v sin 2rkn)] 


+ i(ua + vd) + dX log cos E (u cos 2akn + v sin 2riin) | (79) 


2 





— >> log cos E 
n=0 


—(m/Q) 


(u cos 2rkMn + v sin 2rkMn) | ; 


where β = e^{-(π/Q)}.


The first sum in (79) may be carried out, and when combined with
i(ub + va) yields the semi-invariants λ_10 and λ_01, which are of course
the mean values for x_1 and y_1, respectively. Since the last two terms of
(79) are similar in form, we will confine our manipulations to the next
to the last term. We denote this term by


\[ F(u,v) = \sum_{n=0}^{\infty} \log\cos\left[\frac{\beta^n}{2}\left(u\cos 2\pi k n + v \sin 2\pi k n\right)\right]. \tag{80} \]




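Using the infinite product expansion for the cosine and the power series
expansion for the log; i.e.,

\[ \cos z = \prod_{m=0}^{\infty}\left[1 - \frac{4z^2}{(2m+1)^2\pi^2}\right] \]

and

\[ \log(1 - z) = -\sum_{j=1}^{\infty}\frac{z^j}{j} \qquad (z^2 < 1), \]

F(u,v) becomes

\[ F(u,v) = -\sum_{n=0}^{\infty}\sum_{m=0}^{\infty}\sum_{j=1}^{\infty}\frac{1}{j}\left[\frac{uC_n + vS_n}{(2m+1)\pi}\right]^{2j}, \tag{81} \]

where C_n = e^{-αn} cos 2πkn and S_n = e^{-αn} sin 2πkn. The sum over m may
be obtained by virtue of

\[ \sum_{m=0}^{\infty}\frac{1}{(2m+1)^{2j}} = \frac{(2^{2j} - 1)(-1)^{j+1}(2\pi)^{2j}B_{2j}}{2^{2j+1}(2j)!}, \]

where the B_{2j} are the Bernoulli numbers. With the above sum over m
and the expansion of (uC_n + vS_n)^{2j} in a binomial series, we arrive at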


_ Se (=1)'B,(2” — 1) °) y ryt (@ Bit 
F(uyv) = = —~35(a7)1 Da a C,"u' (S,v) (82) 
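
The closed form for the sum over odd integers used in the step leading to (82) can be checked numerically. The sketch assumes the identity reads Σ 1/(2m+1)^{2j} = (2^{2j} − 1)(−1)^{j+1}(2π)^{2j} B_{2j} / (2^{2j+1} (2j)!), with the standard Bernoulli numbers B_2 = 1/6, B_4 = −1/30, B_6 = 1/42. The function names and the truncation length are illustrative.

```python
import math
from fractions import Fraction

# Bernoulli numbers B_2, B_4, B_6 (standard values).
BERNOULLI = {2: Fraction(1, 6), 4: Fraction(-1, 30), 6: Fraction(1, 42)}

def odd_power_sum_closed_form(j):
    """Closed form of sum_{m>=0} 1/(2m+1)**(2j) in terms of B_{2j}."""
    b = float(BERNOULLI[2 * j])
    return ((2 ** (2 * j) - 1) * (-1) ** (j + 1) * (2 * math.pi) ** (2 * j) * b
            / (2 ** (2 * j + 1) * math.factorial(2 * j)))

def odd_power_sum_direct(j, terms=200_000):
    """Direct partial sum, truncated after `terms` terms."""
    return sum(1.0 / (2 * m + 1) ** (2 * j) for m in range(terms))

for j in (1, 2, 3):
    print(j, odd_power_sum_closed_form(j), odd_power_sum_direct(j))
```

For j = 1 the closed form reduces to π²/8, which the partial sum approaches from below.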
Proceeding in the same manner that took us from (80) to (82), it can 
be verified that the last term of (79) takes the same form as the right- 
hand side of (82) with n replaced by nM. These results and comparison 
with the definition of the semi-invariants for a two dimensional dis- 
tribution” lead to the following for the semi-invariants for the process 
under consideration: 


al 1 — B cos 2rk 
io = 





oe 1 — B” cos 2rkM | se, 
1 — 28 cos2xk + 6B 1 — 28” cos 2xkM + B* 


2 
1 I 8 sin 2xb 
2 





6” sin 2rkM | 
ho = 51 — 26 cos Ink + * 1 — 26 cos IakM + we | to 
and 
Re Dead = 1 i) , ; 
Arslrts>1 = oe eG. => Cum Sam }. (83) 
r + § n=0 


The sum over n can be shown to be a geometric series multiplied by 
two finite series if the sines and cosines in S and C respectively are rep- 
resented in exponential form and use is made of the binomial expansion. 
After some algebra, an alternate form for (83) can be shown to be 
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_ Boo 1718! 
Reales = ~(r + 8)2"(21)* G(7,s,8, k AT), (84) 


where G(7r,s,8,k,/7) is (shortened to G) 


7 r s (—1)! 
G= 2 ha@opieog! 
1 


| EES oe (85) 


1 
T= BEF) exp (22akM(r + s — 2p — am 


For u and v in the neighborhood of zero, the contributions to the series
in (79) become smaller as n becomes larger. The importance of successive
terms is judged by the exponential decay factor e^{-αn}. If we
consider all terms up to some n_max where n_max >> Q/π and k n_max << 1,
then we arrive at the following inequality

\[ kQ \ll \pi. \tag{86} \]


Under the above condition cos 2πkn can be replaced by unity and sin
2πkn by 2πkn for all terms of importance in the series and (79) becomes
approximately


uf 1 1 B Mp” 
log ot) ~ ia tre tol git ae am 
+ (ub + ‘ no S log cos 5 (u + 2akno)| (87) 
n=0 


co Mn 
— >) log cos ee (u + Dakine). 
n=0 


Paralleling the operations performed on (80) to obtain (82) it can 
be shown that the semi-invariants obtained from (87) under the con- 
dition (86) are 


awe ix) + 











sie = oe ‘ 
= Se 8 By43(2" ue ae d° sf 1 oo 1 : 
Neelehesi pos ( 1) (r + s) (2Qrk)° dg? 1 =A eg 1 = e-Ma 5] 


with g = (r + s)π/Q.
C-2. Same as C-1 Above Except That Pulses Are Impulses


For this case the semi-invariants are as above with a = 0 = b. 


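C-3. Impulse Excitation, All Pulses Random

With this type of excitation, we have

\[ \lambda_{10} = \frac{1}{2}\left[\frac{1 - \beta\cos 2\pi k}{1 - 2\beta\cos 2\pi k + \beta^2}\right], \]

\[ \lambda_{01} = \frac{1}{2}\left[\frac{\beta\sin 2\pi k}{1 - 2\beta\cos 2\pi k + \beta^2}\right], \]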





and 


ie ene ee 3 ee a 
rs|r¢s>1 (r + s)2"(2i)s pi(r — p)!ql(s — q)! 


o 


p=0 q= 


(89) 


1 
‘t= B's exp [i2rk(r + s — 2p — a 


It is readily shown in this case that the approximate semi-invariants 
[subject to (86)] are 


\[ \lambda_{10} = \frac{1}{2}\,\frac{1}{1 - \beta}, \]

\[ \lambda_{01} = \frac{\pi k \beta}{(1 - \beta)^2}, \tag{90} \]


2 Brys(2"* — 1 ma 1 
Arelrps>1 = (—1)° : ag (2k) dg (3), 





with g = (r + s)π/Q.
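
The mean values for the all-random impulse case can be checked against a direct term-by-term expectation. The sketch assumes that x and y are the sums of a_n β^n cos 2πkn and a_n β^n sin 2πkn over n ≥ 0, with every a_n equal to 0 or 1 with probability 1/2, and that the closed forms read λ_10 = ½(1 − β cos 2πk)/(1 − 2β cos 2πk + β²) and λ_01 = ½ β sin 2πk/(1 − 2β cos 2πk + β²). Function names and the truncation length are illustrative.

```python
import math

def mean_semi_invariants(Q, k):
    """Closed-form means (lambda_10, lambda_01) for all-random impulses."""
    beta = math.exp(-math.pi / Q)
    c, s = math.cos(2 * math.pi * k), math.sin(2 * math.pi * k)
    denom = 1.0 - 2.0 * beta * c + beta ** 2
    return 0.5 * (1.0 - beta * c) / denom, 0.5 * beta * s / denom

def mean_by_direct_sum(Q, k, n_terms):
    """Means of x and y from term-by-term expectation, E[a_n] = 1/2."""
    beta = math.exp(-math.pi / Q)
    mean_x = sum(0.5 * beta ** n * math.cos(2 * math.pi * k * n) for n in range(n_terms))
    mean_y = sum(0.5 * beta ** n * math.sin(2 * math.pi * k * n) for n in range(n_terms))
    return mean_x, mean_y

Q, k = 100, 1e-3
beta = math.exp(-math.pi / Q)
closed = mean_semi_invariants(Q, k)
direct = mean_by_direct_sum(Q, k, 20 * Q)
# Small-kQ approximations, as read from (90).
approx = (0.5 / (1.0 - beta), math.pi * k * beta / (1.0 - beta) ** 2)
print(closed, direct, approx)
```

At kQ = 0.1 the approximate forms differ from the exact ones by a few percent, and the gap closes as kQ is reduced.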
APPENDIX D 


High Q Behavior of p(@) 


To illustrate the behavior of the probability density function when
the Q of the resonator becomes large, we consider p(θ) in the neighborhood
of the mean, θ_0. We include terms of the double summation in
(19) for which r + s = 4. Since the Bernoulli numbers B_{r+s} = 0 for
r + s odd and > 1, the terms λ_rs for r + s = 3 are zero. For θ ≈ θ_0,
therefore, p(θ) becomes
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p(0) + 
AJ Qe (2090? — 21190 -F Aoz)? 
dio (8 — @)° 
rae 2 Om = hte ou) (91) 

H ( A10(9 — 45) ) 
slit ak *\V2(A2000 — 20119 + doz)? >> Gs Ars eo 
| V/2(r2080 — 2)1196 + hoo)” ee $! A094 

Mo 


The semi-invariants of interest in the above equation are given below
and were determined using the results of the previous section for the
case "all impulses random," subject to kQ << π.





_ 1 = Xo _ 2Qakp 
MOS trig) gs es 
pol | ee OE! TE , §6 eo) 
20 = | (i — #) uu 5) a_— PP Bye 02 i eye By 
ajc tee a Da a a Ss Gere 2) 
40 8 — By) 31 rT (i — BP By 22 2 a py 
g* 
Mis = —(ak)* —— a — By (1 + 46° + B°) 
B 
ho = —2(ak)* a= (i — BS (1 a 11° - 116° Ae eB”). 


Using the above expressions for the λ's, the following quantities in (91)
may be reduced to




















10 _ 1 = 1 
(A20902 — 2d1180 + Doz)? /2(2Qrk)B a” 
(1 6) 1 6)? 
eae "1 2(2ak)*B" 
wal eae | an eas 
Pn 
| ASO = 6) 4 6880 By 
CP =6*) (Lo pt)? 
ABB) GL Ag 58) gl 8) 6 Ae sr, ), 
(L=—) (l= 87° 


or 
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r+s=4 x pe uy 
pape (—-1) 4 as a 


oe Se 


The probability density therefore takes the form 





_ (8 = 6)" » 4M 1 (Fae) . (92) 


4 
eee ral aa ri ay © oa 


This result is in the form of the standard Edgeworth approximation
with θ_0, σ, and λ_4 the mean, the standard deviation, and the 4th semi-invariant
of the θ distribution, respectively. In the limit as Q becomes
large (β → 1) we approximate 1 − β by π/Q and

\[ \sigma \to k\sqrt{\pi Q}, \qquad \theta_0 \to 2kQ. \]

The coefficient of the 4th Hermite polynomial approaches −(5π/128Q).
Equation (92) then indicates the approach to the normal law with the
first correction term going as 1/Q. The results for θ_0 and σ correspond
to those derived earlier by Bennett, Rice and others.
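
The high-Q limits for the mean and standard deviation can be checked numerically. The sketch below is not the authors' computation. It assumes impulse excitation with all pulses random (probability 1/2, a = b = c = 0), takes θ_0 as the ratio of the mean of y to the mean of x, and uses the linearized variance σ² = Var(y − θ_0 x)/λ_10², which is the combination λ_20 θ_0² − 2λ_11 θ_0 + λ_02 divided by λ_10². The function name and parameter values are illustrative.

```python
import math

def high_q_check(Q, k):
    """Return (theta_0, sigma) for comparison with 2kQ and k*sqrt(pi*Q).

    Assumptions for this sketch: independent a_n with probability 1/2,
    theta_0 = mean(y) / mean(x), and
    sigma**2 = Var(y - theta_0 * x) / mean(x)**2.
    """
    beta = math.exp(-math.pi / Q)
    n_terms = int(20 * Q)                      # truncation of the infinite sums
    cos_w = [beta ** n * math.cos(2 * math.pi * k * n) for n in range(n_terms)]
    sin_w = [beta ** n * math.sin(2 * math.pi * k * n) for n in range(n_terms)]
    mean_x = 0.5 * sum(cos_w)
    mean_y = 0.5 * sum(sin_w)
    theta_0 = mean_y / mean_x
    # Each a_n has variance 1/4 and the a_n are independent.
    variance = 0.25 * sum((s - theta_0 * c) ** 2 for c, s in zip(cos_w, sin_w))
    sigma = math.sqrt(variance) / mean_x
    return theta_0, sigma

Q, k = 1000, 1e-5
theta_0, sigma = high_q_check(Q, k)
print(theta_0, 2 * k * Q)
print(sigma, k * math.sqrt(math.pi * Q))
```

For Q = 1000 and kQ = 0.01 both quantities should land within a fraction of a percent of the limiting expressions.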


APPENDIX E 


Determination of θ_max


For kQ << π, a good approximation for θ is (from Appendix B)

\[ \theta = \frac{a + 2\pi k\sum_{n=0}^{\infty} a_n n \beta^n}{b + \sum_{n=0}^{\infty} a_n \beta^n}. \tag{93} \]

When a_0 = 1, we have

\[ \frac{\theta}{2\pi k} = \frac{\dfrac{a}{2\pi k} + \sum_{n=1}^{\infty} a_n n \beta^n}{1 + b + \sum_{n=1}^{\infty} a_n \beta^n}. \tag{94} \]

It is of interest to determine the pulse pattern that yields the maximum
value of θ/2πk. This is equivalent to the determination of a one-zero
sequence of a_n's such that (94) is a maximum.

Assume that an initial pattern has been chosen such that θ/2πk =
A_0/B_0. If a single a_n is changed from zero to one (pulse added), then
θ/2πk is changed to (A_0 + nβ^n)/(B_0 + β^n). Clearly, we should effect
this conversion if

\[ \frac{A_0 + n\beta^n}{B_0 + \beta^n} \ge \frac{A_0}{B_0} \]

or

\[ n \ge \frac{A_0}{B_0}. \tag{95} \]

On the other hand if a one is changed to a zero (pulse removed), then
θ/2πk will be increased if

\[ \frac{A_0 - n\beta^n}{B_0 - \beta^n} \ge \frac{A_0}{B_0} \]

or

\[ n \le \frac{A_0}{B_0}. \tag{96} \]

The process is continued in this manner until all a_n = 1 for n ≥ n_c and
all a_n = 0 for n < n_c (except a_0, which is constrained to be unity).
n_c may be determined from the above process, since





Omax ek ui = ne 
ae oe ae: (97) 
at 
oe a 
which can be rearranged to 
Netl 
opi 7 agg t md + 8. (98) 


When a periodic pulse pattern of 1 out of every M pulses is forced,
θ_max is found in the same manner as above and the relationship between
the various parameters to achieve this maximum is given by (15) of
the main body of the paper.
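
The pulse-swapping argument of this appendix implies that the maximizing pattern consists of zeros up to some slot and ones from there on. The sketch below checks this by exhaustive search on a short pattern. It assumes the ratio has the form (a/2πk + Σ a_n n β^n)/(1 + b + Σ a_n β^n) with the sums over n ≥ 1. The value β = 0.8, the pattern length, and the function names are illustrative, and a = b = 0 is used for simplicity.

```python
from itertools import product

def ratio(pattern, beta, a_term=0.0, b=0.0):
    """theta / (2*pi*k) for pulses a_1..a_N, following the assumed form of (94).

    pattern[i] is a_{i+1}; a_0 = 1 is implied. a_term stands for a / (2*pi*k).
    """
    num = a_term + sum(a * n * beta ** n for n, a in enumerate(pattern, start=1))
    den = 1.0 + b + sum(a * beta ** n for n, a in enumerate(pattern, start=1))
    return num / den

def best_pattern_brute_force(n_slots, beta):
    """Exhaustive search over all 0/1 patterns of length n_slots."""
    return max(product((0, 1), repeat=n_slots), key=lambda p: ratio(p, beta))

def best_pattern_threshold(n_slots, beta):
    """Search only patterns of the form 0...0 1...1 (ones from n_c onward)."""
    candidates = [(0,) * (nc - 1) + (1,) * (n_slots - nc + 1)
                  for nc in range(1, n_slots + 2)]
    return max(candidates, key=lambda p: ratio(p, beta))

beta, n_slots = 0.8, 12
brute = best_pattern_brute_force(n_slots, beta)
threshold = best_pattern_threshold(n_slots, beta)
print(brute, ratio(brute, beta))
print(threshold, ratio(threshold, beta))
```

In this example the exhaustive search and the threshold search return the same pattern, and the first occupied slot n_c brackets the maximum ratio, n_c − 1 ≤ A_0/B_0 ≤ n_c, as conditions (95) and (96) require.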
Properties and Design of the Phase-
Controlled Oscillator with a
Sawtooth Comparator


By C. J. BYRNE 


(Manuscript received September 1, 1961) 


A sawtooth phase comparator has advantages over the more common 
sinusoidal comparator in a phase-controlled oscillator because its output 
is linear for larger values of phase error. For some applications, it is no
more complex or expensive than the sinusoidal comparator. 

This paper analyzes properties of the phase-controlled oscillator with a 
sawtooth comparator that have been mentioned in the literature for sinu-
soidal comparators. In addition, there is new theoretical material on the
effect of fast jitter and noise. 

The properties of the circuit are presented in a manner which is con- 
venient for design. 

Since it is easier to analyze the circuit with a sawtooth comparator, many
applications of the device have been considered. Because of this wide view- 
point, the paper may be helpful in understanding the phase-controlled 
oscillator in general. 
I. INTRODUCTION


The phase-controlled oscillator (see Fig. 1), otherwise known as the 
phase-locked oscillator, is often used to produce a signal whose frequency 
and phase are controlled by an input signal. The literature’”’ on the 
subject assumes that the phase comparator, which is the error detector 
of the loop, produces an output which is proportional to the sine of the 
phase difference. 

This paper considers the case of the sawtooth comparator, whose 
output is a linear function of the phase difference over a periodic range 
(see Fig. 2a). Because of this linearity, the sawtooth comparator is 
superior in operation to the sinusoidal comparator for some applica- 
tions. In general, the sinusoidal comparator is simpler and cheaper, but 
in applications involving digital signals, the two are comparable in cost 
and complexity. 

The purpose of this paper is to present a comprehensive survey of 
many properties of the phase-controlled oscillator, relating to many 
different applications. We have drawn heavily on the literature, modify- 
ing the analysis to make it apply to the sawtooth comparator. In addi- 
tion, there is new theoretical material on the effect of fast jitter and 
noise. New results derived by A. J. Goldstein in a companion paper’ 
are presented in an abbreviated form, more suitable for design. 
Most of the properties are presented in a graphical form which facili-
tates design.


II. DESCRIPTION 


2.1 General 


The block diagram of a phase-controlled oscillator is shown in Fig. 1. 
Notice the resemblance to a negative feedback amplifier or a servo loop. 
There is a forward gain path, a feedback path, and a subtracting or 
error detecting device. 

The input and output signals are not the voltages themselves, but 
are the phases of the nearly periodic voltages. If the input and output 
voltages are at different frequencies, dividers or multipliers must be 
used to bring them to a common frequency at the phase comparator. 
In this paper, we will assume that the output and the input are at the 
same frequency. We will, however, consider the use of dividers to allow
the comparator to operate on the Nth submultiple of the input and out- 
put frequency. We will measure phase of the submultiple signals in 
radians of the original frequency. 


2.2 Phase Comparator 


The phase comparator is the error detector of the servo loop. It pro- 
duces a voltage which depends on the phase difference between the input 
submultiple and the output submultiple. 

Fig. 1 — Block diagram of the phase-controlled oscillator. (Blocks: phase comparator, filter, variable frequency oscillator, and dividers in the input and feedback paths. Forward gain: α_1 α_2 α_3 H(s)/s. Phase error: φ_e = φ_i − φ_o.)

Of course, the comparator cannot distinguish between different cycles
of the input and output submultiples. Therefore, its output must be a
periodic function of the phase difference between input and output, with 
a period equal to one cycle of the submultiple frequency or N cycles of 
the input and output frequency. We see that the greater the divider 
ratio, the greater the range of the phase comparator, in cycles of the 
input and output frequency. 

The sawtooth and sinusoidal comparator functions are shown in Fig. 2. 
The phase error is measured in radians of the input and output fre- 
quency. The gains have been adjusted so that the slopes at zero are 
identical. This means that the functions have the same small-signal 
performance at zero quiescent phase error. Note that the peak output
of the sawtooth comparator is π times the peak of the sinusoidal com-
parator.
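The two characteristics can be sketched in a few lines of Python. This is a minimal illustration, assuming a comparator gain `alpha1` (volts per radian at zero error) and a divider ratio `n`. The function names are hypothetical, and the sinusoidal form α_1 N sin(φ/N) is an assumption chosen to give the same slope at zero.

```python
import math


def sawtooth_comparator(phase_error, n, alpha1):
    """Sawtooth output: linear in the phase error, periodic with period 2*pi*n."""
    period = 2.0 * math.pi * n
    # Wrap the error into [-pi*n, +pi*n) before applying the linear gain.
    wrapped = (phase_error + period / 2.0) % period - period / 2.0
    return alpha1 * wrapped


def sinusoidal_comparator(phase_error, n, alpha1):
    """Sinusoidal output scaled to have slope alpha1 at zero phase error."""
    return alpha1 * n * math.sin(phase_error / n)
```

With equal slopes at zero, the sawtooth peaks near Nπα_1 while the sinusoid peaks at Nα_1, which is the factor of π noted above.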

The sampler and mixer types of sinusoidal comparator are described 
in the literature.” 

Since the sawtooth characteristic is not common, we will describe
one method of building such a comparator. We assume that the input
and output signals are available as short pulses. If the signals are orig-
inally sinusoids, the pulses can be obtained from zero crossings. As
shown in Fig. 3, these pulses control a flip-flop. The input is sent into
the set terminal of the flip-flop and the output is sent into a complement
(or count) terminal. Therefore the time spent in the set state
will be the time between the input pulse and the output pulse.

Fig. 2 — Characteristics of the sawtooth and sinusoidal phase comparators. (a) Sawtooth characteristic, with peak output ±πNα_1. (b) Sinusoidal characteristic, with peak output ±Nα_1.

Fig. 3 — Flip-flop sawtooth phase comparator.

If the flip-flop puts out a positive voltage in the set state and an equal 
negative voltage in the reset state, the average output voltage will be 
a linear function of the phase error. The average output will be zero 
when the pulses are 180° out of phase. Therefore one of the signals 
should be inverted before pulse forming if the output is desired to be 
in phase with the input. 
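The linear dependence of the average flip-flop voltage on pulse spacing can be checked with a short sketch. The function name `flipflop_average` and the output level `v` are hypothetical. The model assumes ideal pulses and a flip-flop that is set by the input pulse and reset by the next output pulse.

```python
import math


def flipflop_average(phase_lag, v=1.0):
    """Average flip-flop output when the output pulse trails the input pulse
    by phase_lag radians (taken modulo one cycle)."""
    # Fraction of each cycle spent in the set state.
    duty = (phase_lag % (2.0 * math.pi)) / (2.0 * math.pi)
    # +v for a fraction `duty` of the cycle, -v for the remainder.
    return v * duty - v * (1.0 - duty)
```

The average is zero when the pulses are half a cycle apart and changes by v/π per radian on either side of that point.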

If the phase error exceeds ±π, the pulses will pass each other. There
will be a sudden discontinuity, and the voltage will change quickly from 
one extreme to the other. 

If the input signal is turned off, the flip-flop acts like a binary counter, 
driven by the output signal. The average output voltage will be zero. 

The average voltage will be extracted from the flip-flop output by the 
low-pass filter. It should have a cutoff frequency low enough to remove 
signal components near the submultiple frequency. 
Since this type of comparator works on zero crossings, its conversion
gain is independent of signal amplitude. 

A sampler comparator can also have a sawtooth characteristic, if the 
input has a sawtooth waveform. 

Because of the operation of the phase comparator, the phase-con- 
trolled oscillator is really a sampled system. E. G. Kimme has shown’ 
that the phase-controlled oscillator can be treated as a continuous sys- 
tem if the sampling frequency is so high that its effects are strongly 
attenuated by the closed loop. We will assume this to be the case through- 
out this paper. 


2.3 Filter 


The filter has a low pass characteristic to attenuate fast changes in 
the phase error due to noise in the input signal. It also helps to smooth 
out the high frequency component of the phase comparator output. 
Usually a simple RC filter or a phase lag filter is used, as shown in Fig. 4. 


2.4 Oscillator 


The variable oscillator produces the output signal. When its input
voltage v_2 is zero, the output frequency is the design center frequency
ω_c. If v_2 is not zero, the output frequency varies in proportion to v_2.
Since the important property of the output is the phase, which is the 
integral of the frequency, the variable oscillator acts like a perfect in- 
tegrator. 


III. OPERATION 


Readers who have a background in servo systems may find it helpful
to think of the phase-controlled oscillator as a type 1 servo system, such
as a velocity motor with position feedback.’ The analogy is clear from
Fig. 1.

Fig. 4 — Filters.
(a) R-C filter: H(s) = 1/(1 + sT_1), with T_1 = R_1 C, T_2 = 0, τ_1 = αR_1 C, τ_2 = 0.
(b) Phase lag filter: H(s) = (1 + sT_2)/(1 + sT_1), with T_1 = (R_1 + R_2)C, T_2 = R_2 C, τ_1 = α(R_1 + R_2)C, τ_2 = αR_2 C.


3.1 Aligned Operation 


Let the frequency of the input signal be identical to the center fre- 
quency of the oscillator and let the phase error be zero. Then the input 
to the oscillator is zero and its frequency will be identical to that of the 
input. 

Now let us quickly advance the input phase by a small amount and 
continue at the center frequency. There will be a positive error voltage 
which will increase the output frequency. The output phase will advance 
until it catches up to the input. The circuit cannot settle down until 
the output phase is identical to the input phase, because of the integrat- 
ing action of the oscillator. 


3.2 Mistuning 


Assume that the input frequency increases a little, causing the input 
phase to continually advance. As before, a positive error signal will 
result, increasing the output frequency. Therefore the output phase will 
continually advance. When the circuit settles down to a steady state, 
the phase error will be constant, and just sufficient to detune the oscil- 
lator so that its frequency will be identical to the input frequency. The 
greater the phase-to-frequency gain of the forward path, the less phase 
error will result from a given input frequency deviation. 


3.3 Jitter


Now let the average input frequency be constant, but assume that 
the phase is jittering back and forth. Suppose the jitter is very rapid. 
Even if there were no filter, the integrating action of the oscillator would 
smooth out the jitter so that the output would be more stable than the 
input. The low-pass filter, of course, smooths the error signal before it 
gets to the oscillator and attenuates the jitter even more. 

If the amplitude of the jitter is too great, the phase comparator will 
go through a discontinuity, and when the circuit settles down again, it 
will have slipped N cycles of the input, either ahead or behind. 

As the rate of jitter decreases, the operation of the loop becomes more 
complex. Because of the integration, jitter in the oscillator phase lags 
the fluctuations in its input voltage by 90°. If the low pass filter also 
has about 90° phase lag at some frequency of jitter, we see that we have 
positive feedback instead of negative feedback. The open loop phase 
gain is the ratio of a change in output phase to a change in phase error.
If this is large enough at a frequency where we have positive feedback, 
we can actually have an increase in jitter, or even a jitter oscillation’ 
which would destroy the usefulness of the device for most purposes. 

If the jitter is very slow compared to the loop time constants, the 
servo loop will track it, and the jitter will be passed on to the output. 

If the jitter is distributed in a wide band, such as that caused by the 
addition of white noise to a coherent signal, the circuit will respond only 
to that jitter resulting from noise components near the frequency of the 
coherent signal. Therefore the circuit can be used to enhance the signal- 
to-noise ratio of a phase-modulated carrier. This property also allows 
the circuit to lock on a coherent signal of approximately known fre- 
quency although it is surrounded by strong wide-band noise. 


3.4 Phase Modulation 


The error signal v_1 (see Fig. 1) is essentially proportional to the phase
modulation of the input at frequencies higher than the circuit can track, 
and to the frequency modulation at frequencies that can be tracked. 
The signal at v2 is filtered to reduce noise. Therefore the circuit can be 
used as a demodulator of phase or frequency modulated signals in noise. 

The circuit can also be used as a phase modulator. The carrier is con- 
nected to the input. The modulating voltage is added to the output of 
the comparator. The feedback tends to keep the oscillator input voltage 
small. Therefore the comparator output must be nearly equal to the 
negative of the modulating voltage. This means that the output phase 
is nearly proportional to the modulating voltage. At high frequencies, 
the loop gain drops, and these relations are no longer valid. 


3.5 Quieting 


If the input signal is smooth, but the oscillator itself is jittery because 
of internal noise, the oscillator will be quieted by the feedback, especi- 
ally at low frequencies where the problem is likely to be most serious. 


3.6 Discontinuities


We have looked at the small-signal linear performance of the phase- 
controlled oscillator; now let us examine its operation when it is passing 
through discontinuities. Suppose we increase the input frequency until
the phase error is nearly equal to +Nπ, where N is the divider ratio.
A small further increase will cause the phase comparator to go through
a discontinuity, making the error −Nπ. This will start to decrease the
oscillator frequency and the error will rapidly return to +Nπ, and then
jump to −Nπ again. After a short time, the error will settle down to a
periodic behavior, with discontinuities at regular intervals. Since the
average error must be somewhat less than +Nπ, the average output
frequency will be somewhat less than the input frequency, and the fre-
quency of the phase error will be the beat frequency between input and
output, divided by N.
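The locked and "flickering" regimes can be reproduced with a crude time-domain sketch. It assumes no filter (H = 1), so the phase error obeys dφ_e/dt = ω_m − αφ_e between discontinuities. The function name, step size and run length are hypothetical choices.

```python
import math


def simulate_error(omega_m, alpha, n=1, t_end=20.0, dt=1e-3):
    """Euler integration of d(phi_e)/dt = omega_m - alpha*phi_e (no filter),
    wrapping phi_e through the sawtooth discontinuity at +/- n*pi.
    Returns the final phase error and the number of discontinuities."""
    limit = n * math.pi
    phi_e, slips = 0.0, 0
    for _ in range(int(t_end / dt)):
        phi_e += dt * (omega_m - alpha * phi_e)
        if phi_e >= limit:        # comparator jumps from +n*pi to -n*pi
            phi_e -= 2.0 * limit
            slips += 1
        elif phi_e < -limit:      # jump in the other direction
            phi_e += 2.0 * limit
            slips += 1
    return phi_e, slips
```

For a mistuning below Nπα the error settles to a constant. Above it the error repeatedly runs up to +Nπ and jumps to −Nπ.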


3.7 Pull-in 


As the input frequency is reduced in this ‘flickering’ state, the beat
frequency decreases. Finally, the phase error does not quite hit a dis- 
continuity at its highest excursion, and the error settles down to a static 
value. We say the loop has pulled into lock with the input. 

Depending on the nature of the filter, there may or may not be hys- 
teresis in the pull-in action. If there is hysteresis the pull-in frequency 
deviation will be less than the deviation which can be held in lock, once
lock has been established. 


IV. APPLICATIONS 


The phase-locked oscillator has many interesting capabilities, and 
consequently has found many diverse applications.’ Some of the func- 
tions and examples of use are: 

a. Locking a high frequency signal to a submultiple; television sync 
signals are locked to the power frequency. 

b. Locking a strong steady signal to a weak, intermittent signal; 
television color carrier recovery. 

c. Locating and locking on a weak coherent signal in wide-band noise; 
space communication. 

d. Detecting phase or frequency shifts in a signal; space communica- 
tion. 

e. Smoothing a jittery signal; smoothing jitter in a digital signal. 

f. Locking a high-power oscillator to a more stable low-power oscil- 
lator; microwave generation. 

g. Phase modulation of a reference carrier. 

h. Frequency synthesis. 

Each of these applications requires a different viewpoint in analyzing 
the circuit. An optimization process for one application may be useless 
in another. Even an expression such as noise bandwidth may not have 
the same meaning with a jitter reducing circuit as with a microwave 
source. 

The application we have foremost in mind is that of capturing and
smoothing a jittering timing signal for a digital channel. Most of the
properties we analyze are chosen for this application. However, we 
present additional material which is needed for other applications. We 
have attempted to be explicit in revealing our viewpoint when we define 
noise bandwidth, figure of merit, etc. 


V. QUIESCENT OPERATION 


5.1 Steady-State Error 


If a phase-locked oscillator is synchronized with a signal whose fre- 
quency is not identical with the oscillator’s center frequency, there 
must be a steady phase error. The comparator converts this phase error 
into the voltage required to tune the oscillator so that its output fre- 
quency will be identical to the input frequency. 

The gain α is the low frequency conversion gain from phase error to
frequency (see Fig. 1). It is the change in output frequency (in radians
per second) that results from a change in phase error of one radian. The
mistuning frequency ω_m is the difference between the input frequency
and the oscillator center frequency. Then the steady phase error is

\[ \varphi_e = \frac{\omega_m}{\alpha}. \tag{1} \]

The phase error is directly proportional to the mistuning. With a
given mistuning, the error may be made as small as desired by increas-
ing the gain, α. However, we shall see that high gain has undesirable
effects also.


5.2 Lock Frequency 


The greatest frequency mistuning that can be locked in synchronism 
is determined by the maximum output of the phase comparator. At the 
limit,

\[ |\omega_m| = \omega_L = N\pi\alpha. \tag{2} \]

We call ω_L the lock frequency.


5.3 Phase Error Margin 


One of the advantages of the sawtooth comparator over a sinusoidal 
comparator is that the small-signal performance is independent of the 
steady mistuning, since the gain does not depend on the phase error. 
However, mistuning reduces the margin between the steady phase error 
and the error which will cause a discontinuity. This limits the permis-
sible peak jitter amplitude, if no discontinuities are allowed.
The phase error margin is

\[ \varphi_m = N\pi - \frac{|\omega_m|}{\alpha}. \tag{3} \]
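The three quiescent quantities can be evaluated together. The following is a small sketch of equations (1) to (3). The function name and the choice to raise an error outside the lock range are assumptions made for illustration.

```python
import math


def quiescent_operation(omega_m, alpha, n):
    """Return (steady phase error, lock frequency, phase error margin)
    from equations (1), (2) and (3). Frequencies in rad/s, phases in radians."""
    lock_frequency = n * math.pi * alpha          # (2)
    if abs(omega_m) > lock_frequency:
        raise ValueError("mistuning exceeds the lock frequency")
    steady_error = omega_m / alpha                # (1)
    margin = n * math.pi - abs(omega_m) / alpha   # (3)
    return steady_error, lock_frequency, margin
```

Raising α reduces the steady error and widens the lock range, and raising N widens both the lock range and the margin.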


VI. RESPONSE IN THE LINEAR REGION 


As long as the circuit is in synchronism and the phase error does not 
exceed the bounds of ±Nπ, the phase controlled oscillator acts like a
linear feedback system. 


6.1 Phase Response 


From Fig. 1, we see that the forward gain of the loop is the product
of the gains of the comparator, filter, and oscillator:

\[ \mu = \frac{\alpha_1 \alpha_2 \alpha_3 H(s)}{s} = \frac{\alpha H(s)}{s}, \tag{4} \]

where α = α_1 α_2 α_3.

The feedback is

\[ \beta = 1. \tag{5} \]

The response of the output phase to changes in the input phase is
given by the familiar negative feedback equation:

\[ Y = \frac{\Phi_o}{\Phi_i} = \frac{\mu}{1 + \mu\beta} = \frac{\alpha H(s)}{s + \alpha H(s)}. \tag{6} \]

The signals Φ_i and Φ_o are phases of the input and output voltages.
The phase error, as a function of the input phase, is

\[ \Phi_e = \Phi_i - \Phi_o = \frac{s}{s + \alpha H(s)}\,\Phi_i. \tag{7} \]

Notice that we measure phase of the submultiple signals in radians of
the original signals.

The filter is usually either an RC filter or a phase lag filter, as shown
in Fig. 4. For the phase lag, the more general case,

\[ H(s) = \frac{1 + sT_2}{1 + sT_1}, \tag{8} \]

where

\[ T_1 \ge T_2. \]

In the RC case, T_2 = 0.
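When we substitute (8) into (6), the transfer ratio becomes

\[ Y = \frac{1 + \frac{s}{\alpha}\tau_2}{1 + \frac{s}{\alpha}(1 + \tau_2) + \left(\frac{s}{\alpha}\right)^2 \tau_1}, \tag{9} \]

where

\[ \tau_1 = \alpha T_1, \qquad \tau_2 = \alpha T_2. \]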
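The substitution leading to (9) can be checked numerically. The sketch below evaluates the transfer ratio two ways, directly from the feedback equation with the phase lag filter and from the normalized form. The function names are hypothetical.

```python
def transfer_direct(s, alpha, t1, t2):
    """Y = alpha*H/(s + alpha*H), with the phase lag filter H = (1+s*T2)/(1+s*T1)."""
    h = (1 + s * t2) / (1 + s * t1)
    return alpha * h / (s + alpha * h)


def transfer_normalized(s, alpha, t1, t2):
    """Y in the normalized form (9), using tau1 = alpha*T1 and tau2 = alpha*T2."""
    tau1, tau2 = alpha * t1, alpha * t2
    p = s / alpha
    return (1 + p * tau2) / (1 + p * (1 + tau2) + p * p * tau1)
```

Both forms agree at any complex frequency s, and both approach unity as s approaches zero.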


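The phase error response is found from (7):

\[ \frac{\Phi_e}{\Phi_i} = \frac{\frac{s}{\alpha}\left(1 + \frac{s}{\alpha}\tau_1\right)}{1 + \frac{s}{\alpha}(1 + \tau_2) + \left(\frac{s}{\alpha}\right)^2 \tau_1}. \tag{10} \]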


Note that the denominator of transfer functions (9) and (10) is a sec-
ond order polynomial, of the form

\[ 1 + 2\zeta\frac{s}{\omega_n} + \left(\frac{s}{\omega_n}\right)^2, \]

where:

\[ \omega_n = \frac{\alpha}{\sqrt{\tau_1}}, \qquad \zeta = \frac{1 + \tau_2}{2\sqrt{\tau_1}}. \tag{11} \]

Equations (9) and (10) appear frequently in the literature, but have
been included here for completeness. Some of the literature’ uses the
natural frequency ω_n and the damping ratio ζ as defining parameters
of the system. We shall use α, τ_1, and τ_2 more often, because they are
more closely related to physical quantities.

Most of the important properties of the phase-controlled oscillator
can be expressed as normalized ratios which are independent of α. There-
fore we shall present these properties as functions of the two remaining
design parameters, τ_1 and τ_2. As an example of our method of presenta-
tion, contours of constant damping ratio ζ are shown on a plot of τ_2 vs
τ_1 in Fig. 5. We will call this the filter plot. Properties of the filter plot
are discussed in Section IX.

Fig. 5 — Contours of constant damping ratio on the filter plot.
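A point on the filter plot follows directly from the component values. The sketch below assumes the phase lag filter time constants T_1 = (R_1 + R_2)C and T_2 = R_2 C. The function name and the component values in the usage line are hypothetical.

```python
import math


def loop_parameters(alpha, r1, r2, c):
    """Natural frequency and damping ratio for a phase lag filter with
    T1 = (R1 + R2)*C and T2 = R2*C. Set r2 = 0 for the simple RC filter."""
    tau1 = alpha * (r1 + r2) * c
    tau2 = alpha * r2 * c
    omega_n = alpha / math.sqrt(tau1)
    zeta = (1 + tau2) / (2 * math.sqrt(tau1))
    return omega_n, zeta, tau1, tau2


# Illustrative values only: alpha in rad/s per radian, ohms, farads.
omega_n, zeta, tau1, tau2 = loop_parameters(alpha=100.0, r1=9000.0, r2=1000.0, c=1e-6)
```

Because τ_1 and τ_2 both scale with α, a change of gain moves the operating point on the filter plot even when the filter components are fixed.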


6.2 Voltage Response 


As we have mentioned, the phase-locked oscillator can be used as a
phase modulator by adding a modulating voltage v_N to the error voltage
v_1. The response of the output phase is:

\[ \Phi_o = \frac{Y}{\alpha_1} V_N, \tag{12} \]

where Y is given by (6) and (9). Note that we have used V_N for the
transform of v_N. Examination of (9) shows that the output phase will
follow the input voltage as long as the modulating frequency is low
enough, since Y approaches unity as s approaches zero.

If the phase-locked oscillator is used as a demodulator the output
can be taken before or after the filter. Therefore we present the response
equations for the voltages at each point (see Fig. 1).

\[ V_1 = \alpha_1 (1 - Y)\,\Phi_i \tag{13} \]

\[ V_2 = \frac{sY}{\alpha_3}\,\Phi_i \tag{14} \]


VII. SMALL-SIGNAL PROPERTIES 


The small-signal properties we shall analyze are the response of the 
output phase to sinusoidal jitter of the input phase, the noise bandwidth, 
the peak jitter gain, the response to a step change in phase, and the 
response to a step change in frequency. All of these effects are not per- 
tinent to every system, but each is useful in some of the applications. 


7.1 Sine Wave Jitter Response 


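The small signal transfer ratio Y between input phase jitter and out-
put phase jitter was given in (9). For sinusoidal jitter, the squared
magnitude (power gain) of Y(ω) is

\[ |Y(\omega)|^2 = \frac{1 + (\omega/\alpha)^2 \tau_2^2}{1 + (\omega/\alpha)^2 \left[1 + \tau_2^2 - 2(\tau_1 - \tau_2)\right] + (\omega/\alpha)^4 \tau_1^2}. \tag{15} \]

The phase of Y(ω) is

\[ \theta(\omega) = \tan^{-1}\left[\frac{\omega}{\alpha}\tau_2\right] - \tan^{-1}\left[\frac{(\omega/\alpha)(1 + \tau_2)}{1 - (\omega/\alpha)^2 \tau_1}\right]. \tag{16} \]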

The jitter attenuation curves for several sets of filter parameters are 
plotted in Fig. 6. 

In Case I, no filter, we have simply an integrator with unity feedback
around it. At low frequencies the jitter is not attenuated; at high fre-
quencies there is a 6 db per octave roll-off. When an RC filter is added,
the additional high frequency attenuation produces a 12 db per octave
roll-off. When the filter time constant is very large, the phase shift in
the forward loop results in positive feedback, and causes a region where
jitter is amplified.

Fig. 6 — Jitter attenuation with various filter parameters.

When a phase lag filter is used, the second break point caused by the 
resistor in series with the capacitor can be used to stabilize the feedback 
loop and reduce the peak jitter gain. Since the attenuation of the phase 
lag filter is constant at high frequencies, the final slope is 6 db per oc- 
tave. 


7.2 Noise Bandwidth 


One of the functions of a phase-controlled oscillator is to reduce noise. 
In the absence of better information, it is usual to assume that some- 
where in the system the noise is white and Gaussian. Since most of the 
noise at the output is usually restricted to a narrow band by the filter- 
ing action of the circuit, it is convenient to express the amount of noise 
that remains as the bandwidth of an ideal filter (i.e., rectangular filter) 
that would pass the same mean square noise. The familiar formula for
computing noise bandwidth is

\[ B = \int_0^\infty |G(\omega)|^2 \, d\omega, \tag{17} \]

where G(ω) is the normalized transfer function between noise input and
noise output, and B is in radians per second. The transfer function which
is used for G(ω) will depend on where the noise input and output are,
and this in turn will depend on the application.

When the phase-controlled oscillator is used to clear up jitter in dig-
ital signals, the appropriate transfer function is Y, the ratio of output
phase shift to input phase shift, as given in (9). We will call the noise
bandwidth of Y the jitter bandwidth B_j. When we substitute (9) into
(17), we have

\[ \frac{B_j}{\pi\alpha} = \frac{1}{2}\,\frac{1 + \tau_2^2/\tau_1}{1 + \tau_2}. \tag{18} \]

We recall that Nπα is the lock frequency. An increase in N increases the
lock frequency without changing B_j.

For no filter, or for any RC filter, the normalized jitter bandwidth is
1/2. For τ_2²/τ_1 much greater than 1, the normalized jitter bandwidth
approaches (1/2)(τ_2/τ_1). The jitter bandwidth is shown on the filter plot
in Fig. 7.

With a sawtooth phase comparator, the jitter bandwidth is inde-
pendent of the mistuning. This is not true of the sinusoidal comparator.
The jitter bandwidth for the sinusoidal case is

\[ \frac{B_j}{\pi\alpha} = \frac{1}{2}\cos\varphi_e\,\frac{1 + (\tau_2^2/\tau_1)\cos\varphi_e}{1 + \tau_2\cos\varphi_e}, \tag{19} \]

where α is the gain at zero error.

Equation (19) can be obtained from (18) by replacing α by (α cos
φ_e), the small signal gain at a quiescent phase error φ_e. Note that α is a
factor in τ_1 and τ_2. Notice that the jitter bandwidth for the sinusoidal
comparator decreases as the mistuning (and therefore φ_e) increases.
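The closed form for the sawtooth comparator's jitter bandwidth can be cross-checked by integrating |Y|² numerically. The sketch below takes the normalized bandwidth as B_j/(πα) = (1/2)(1 + τ_2²/τ_1)/(1 + τ_2), which is what integrating |Y|² over 0 ≤ ω < ∞ yields. The function names and the substitution x = tan u are choices made for this illustration.

```python
import math


def jitter_power_gain(x, tau1, tau2):
    """|Y|^2 at normalized jitter frequency x = omega/alpha."""
    num = 1 + (x * tau2) ** 2
    den = 1 + x**2 * (1 + tau2**2 - 2 * (tau1 - tau2)) + x**4 * tau1**2
    return num / den


def normalized_jitter_bandwidth(tau1, tau2):
    """Closed form for B_j / (pi * alpha); requires tau1 > 0."""
    return 0.5 * (1 + tau2**2 / tau1) / (1 + tau2)


def integrated_jitter_bandwidth(tau1, tau2, steps=20_000):
    """B_j / (pi*alpha) by the midpoint rule, using x = tan(u) to map
    0 <= x < infinity onto 0 <= u < pi/2."""
    h = (math.pi / 2) / steps
    total = 0.0
    for k in range(steps):
        x = math.tan((k + 0.5) * h)
        # dx = (1 + x^2) du under the substitution x = tan(u).
        total += jitter_power_gain(x, tau1, tau2) * (1 + x * x) * h
    return total / math.pi
```

Agreement between the two functions is a check on the algebra, not a substitute for the contours of Fig. 7.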

Now let us consider the effect of interference due to broad-band noise 
added to the input signal. To justify a small signal analysis, we must 
assume that filtering limits the total energy of the interference, to keep 
it well below the signal level. However, we assume that the filtered 
noise is essentially flat in a band around the signal which is much wider 
than the interference noise bandwidth which we shall derive. 
Fig. 7 — Contours of normalized jitter noise bandwidth on the filter plot.


The effect of interference depends strongly on the type of phase com-
parator in the system. We shall analyze the linear zero-crossing case
and the sinusoidal mixer or sampler case.

Interference noise disturbs both the phase and the amplitude of the
input signal. When a zero-crossing comparator is used, only the phase
disturbance is detected. If the noise power density is v_n² (volts² per
radian per second) and the input sinusoid has a peak v_i, the jitter
“power” density for phase in radians is (v_n/v_i)² (radians² per radian
per second).

The output jitter will be

\[ \overline{\varphi_o^2} = \left(\frac{v_n}{v_i}\right)^2 B_j. \tag{20} \]

The effect of broad band input noise is quite different when a sinu-
soidal sampler or mixer phase comparator is used. The following dis-
cussion assumes that the reader is familiar with the literature of the
sinusoidal comparator. With this type of comparator, noise at the input
produces a voltage at the output of the comparator which is independent
of the amplitude and phase of the input signal. The noise density of the
comparator output voltage is (α_1/v_i)² v_n², where v_i is the expected peak
signal amplitude used in computing the expected α at zero error (if no
limiting is used with this type of comparator, the gain depends on the
signal amplitude). When the comparator is connected in a feedback
loop, the appropriate method of analysis is to consider the interference
noise injected at the output of the phase comparator. The appropriate
transfer ratio is that previously used for modulation in (12).

The interference bandwidth B_i can be found by substituting (12) into
(17):

\[ \frac{B_i}{\pi\alpha} = \frac{1}{2}\,\frac{1}{\cos\varphi_e}\,\frac{1 + (\tau_2^2/\tau_1)\cos\varphi_e}{1 + \tau_2\cos\varphi_e}. \tag{21} \]

This is the noise bandwidth given by Rey.’

Notice that the interference bandwidth B_i increases as the phase
error increases, while the jitter bandwidth B_j decreases. The reason for
the difference is that the sampler and mixer comparators are sensitive
to the amplitude of the input signal as well as the phase.

Now we can compare the output phase noise performance of the linear 
zero-crossing comparator with the sinusoidal sampler or multiplier type. 
If they have the same gain at zero error, they will have the same re- 
sponse to jitter and interference at zero error. In the presence of mis- 
tuning, however, the sinusoidal comparator will be more sensitive to 
interference and less sensitive to jitter while the linear comparator will 
not change. 
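The opposite trends of the two bandwidths with mistuning can be tabulated with a short sketch. It assumes the normalized forms B_j/(πα) = (1/2) cos φ_e · F and B_i/(πα) = (1/2) F / cos φ_e, with F = (1 + (τ_2²/τ_1) cos φ_e)/(1 + τ_2 cos φ_e). These follow from replacing α by α cos φ_e in the sawtooth result and are stated here as an assumption. The function name is hypothetical.

```python
import math


def sinusoidal_bandwidths(phi_e, tau1, tau2):
    """Normalized jitter and interference bandwidths for a sinusoidal
    comparator at quiescent phase error phi_e (radians, |phi_e| < pi/2)."""
    c = math.cos(phi_e)
    shape = (1 + (tau2**2 / tau1) * c) / (1 + tau2 * c)
    jitter = 0.5 * c * shape          # shrinks as the error grows
    interference = 0.5 * shape / c    # grows as the error grows
    return jitter, interference
```

At zero error the two values coincide. They separate as φ_e approaches the edge of the comparator range.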

When the phase-controlled oscillator is used as a demodulator, still
another definition of noise bandwidth is required. If we take the output
signal after the filter, which cuts off some of the noise, and assume that
interference noise is added to the input signal, we have for the zero-
crossing detector,

\[ V_2 = \left[\frac{sY}{\alpha_3}\right]\frac{V_n}{v_i}. \tag{22} \]

By substituting the expression in brackets into (17), we can find the
demodulator noise bandwidth, B_D. This bandwidth is not finite for the
phase lag filter, because the transfer ratio does not approach zero at
high frequencies. Therefore higher order filters are desirable for this
application.


7.3 Peak Jitter Gain 


We have shown in Fig. 6 that it is possible for the jitter transfer ratio
to be greater than unity. In most systems, this is not very harmful.
However, where phase-controlled oscillators are connected in cascade,
gain can be very troublesome.

We can find the peak gain | Y | by examining (15) for its maximum.
The frequency at the peak is

\[ \left(\frac{\omega_0}{\alpha}\right)^2 = \frac{1}{\tau_2^2}\left\{\left[1 + \left(\frac{\tau_2}{\tau_1}\right)^2 \bigl(2(\tau_1 - \tau_2) - 1\bigr)\right]^{1/2} - 1\right\}. \tag{23} \]

A. J. Goldstein* has shown that the square of the peak magnitude can
be written

\[ |Y|^2_{\max} = \frac{1}{1 - \tau_1^2\,(\omega_0/\alpha)^4}. \tag{24} \]

An examination of (23) shows that there is no peak, and the gain is
never greater than unity if

\[ \tau_1 - \tau_2 \le \tfrac{1}{2}. \tag{25} \]


The peak gain is shown on the filter plot in Fig. 8. 
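The peak gain can be evaluated numerically and compared with a brute-force scan of |Y|². The sketch below assumes the peak frequency (ω_0/α)² = (1/τ_2²){[1 + (τ_2/τ_1)²(2(τ_1 − τ_2) − 1)]^{1/2} − 1} and the peak value 1/(1 − τ_1²(ω_0/α)⁴). Both follow from setting the derivative of |Y|² to zero, and are stated here as assumptions. The function names are hypothetical.

```python
import math


def jitter_power_gain(x, tau1, tau2):
    """|Y|^2 at normalized jitter frequency x = omega/alpha."""
    num = 1 + (x * tau2) ** 2
    den = 1 + x**2 * (1 + tau2**2 - 2 * (tau1 - tau2)) + x**4 * tau1**2
    return num / den


def peak_jitter_gain_squared(tau1, tau2):
    """Square of the peak |Y|; returns 1.0 when tau1 - tau2 <= 1/2 (no peak).
    Assumes tau2 > 0; an RC filter needs the limit tau2 -> 0 taken separately."""
    if tau1 - tau2 <= 0.5:
        return 1.0
    root = math.sqrt(1 + (tau2 / tau1) ** 2 * (2 * (tau1 - tau2) - 1))
    x0_squared = (root - 1) / tau2**2            # (omega_0/alpha)^2 at the peak
    return 1.0 / (1 - tau1**2 * x0_squared**2)   # peak value of |Y|^2
```

A scan over x is slower but makes no use of the closed forms, so agreement between the two is a useful consistency check.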


7.4 Response to a Step Change in Phase 


Fast phase changes can occur because of quick changes in the trans- 
mission path or because the signal has been deliberately modulated. 
When a step in phase occurs there is a sudden change in the phase error, 
since the phase of the oscillator cannot change instantaneously. The 
error signal controls the oscillator so that the error returns eventually 
to its quiescent value. 

To act like a step change, the phase shift does not have to be instan- 
taneous, as long as the rise time is much less than the shortest time 
constant of the phase-controlled loop. Therefore if the phase comparator 
works from a subharmonic of the input frequency the amplitude of the 
phase change can be several input periods, as long as the change is slow 
enough for the subharmonic generator (counter, etc.) to follow, but 
faster than the loop time constants. 

If a counter is used as a subharmonic generator, an error in the counter,
or an extraneous pulse introduced into the counter, will act like a step
change in input phase.

Fig. 8 — Contours of peak jitter gain on the filter plot.

The response of the phase error to the phase input is given in (10).
When the input phase is a step of amplitude Δφ_i, the time response of
the phase error can be shown to be

\[ \varphi_e = \Delta\varphi_i\, e^{-\zeta\omega_n t}\left[\cosh\left(\sqrt{\zeta^2 - 1}\,\omega_n t\right) - \frac{\zeta - \omega_n/\alpha}{\sqrt{\zeta^2 - 1}}\,\sinh\left(\sqrt{\zeta^2 - 1}\,\omega_n t\right)\right]. \tag{26} \]

For the underdamped case (ζ < 1) the hyperbolic functions in (26)
become trigonometric functions. The damping ratio ζ and the natural
frequency ω_n have been defined in (11).

At t = 0, just after the step, we see that the phase error equals the
change in input phase. If we examine the initial derivative of (26), we 
find that it is never positive. This means that the phase error will never 
exceed its initial value. 

Some examples of the phase error response to a step change in phase 
are shown in Fig. 9. 
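Curves of this kind can be generated with a short sketch. It assumes the step response φ_e = Δφ_i e^{−ζω_n t}[cosh(qω_n t) − ((ζ − ω_n/α)/q) sinh(qω_n t)] with q = (ζ² − 1)^{1/2}, obtained by inverting the phase error transfer function for a step input. The function name is hypothetical, and complex arithmetic is an implementation convenience that covers both damping cases.

```python
import cmath
import math


def phase_error_step_response(t, alpha, tau1, tau2, delta_phi=1.0):
    """Phase error at time t after an input phase step of size delta_phi.
    Complex arithmetic lets one expression cover both the overdamped and
    underdamped cases; zeta exactly equal to 1 is not handled."""
    omega_n = alpha / math.sqrt(tau1)
    zeta = (1 + tau2) / (2 * math.sqrt(tau1))
    q = cmath.sqrt(zeta * zeta - 1)          # imaginary when zeta < 1
    arg = q * omega_n * t
    bracket = cmath.cosh(arg) - (zeta - omega_n / alpha) / q * cmath.sinh(arg)
    return (delta_phi * math.exp(-zeta * omega_n * t) * bracket).real
```

The initial slope of this response is −(ατ_2/τ_1)Δφ_i, which is never positive.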


7.5 Response to a Step Change in Frequency 


A sudden change in frequency can occur because of a change from 
one source to another, because of malfunction, or because the signal has 
been modulated. When a frequency step occurs, the error signal builds 
up until the oscillator frequency catches up to the input frequency, 
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leaving a static change in phase error. If a low-pass filter is used between 
the phase comparator and the oscillator, the transient phase error can 
have a peak value much greater than the quiescent phase change. 

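Let us assume a frequency change $\Delta\omega_i$. This is equivalent to a ramp 
phase input, $\Delta\omega_i t$. We can use (10) to find the response of the phase 
error: 

$$\varphi_e = \frac{\Delta\omega_i}{a}\left\{1 - e^{-\xi\omega_n t}\left[\cosh\left(\sqrt{\xi^2-1}\,\omega_n t\right) - \frac{a/\omega_n - \xi}{\sqrt{\xi^2-1}}\,\sinh\left(\sqrt{\xi^2-1}\,\omega_n t\right)\right]\right\}. \tag{27}$$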


Some examples of the phase error response to a step change in input 
frequency are shown in Fig. 10. 
The peak phase error is of particular interest. For the overdamped 
case it is 

$$\frac{a\hat{\varphi}_e}{\Delta\omega_i} = 1 + (\tau_1 - \tau_2)^{1/2}\exp\left[-\frac{\xi}{\sqrt{\xi^2-1}}\tanh^{-1}\frac{\sqrt{\xi^2-1}}{\xi - \omega_n/a}\right]. \tag{28}$$

For the underdamped case, the inverse hyperbolic tangent is replaced 
by the inverse trigonometric tangent. The value of this angle is between 
zero and $\pi$. An expression closely related to (28) has been derived by 
R. D. Barnard,’ as a capture condition. 

A large value of $\tau_1$ can result in overshoot which is many times the 
quiescent phase error. This means that the response of such a system 
to a sudden frequency shift looks like a pulse. This characteristic is 
useful in demodulation of a frequency shift signal. 
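The overshoot can be illustrated by sampling the response (27) and comparing the largest sample with the closed form (28). This is a sketch under stated assumptions: $\omega_n = a/\sqrt{\tau_1}$ and $\xi = (1+\tau_2)/(2\sqrt{\tau_1})$ are inferred rather than quoted from (11), only the overdamped case with $\tau_2 > 1$ is handled, and all function names are hypothetical.

```python
import math


def _constants(a, tau1, tau2):
    # Assumed forms of the natural frequency and damping ratio.
    omega_n = a / math.sqrt(tau1)
    xi = (1.0 + tau2) / (2.0 * math.sqrt(tau1))
    return omega_n, xi, math.sqrt(xi * xi - 1.0)  # requires xi > 1


def normalized_error(t, a, tau1, tau2):
    """a * phi_e / delta_omega at time t after a frequency step, per (27)."""
    omega_n, xi, r = _constants(a, tau1, tau2)
    x = r * omega_n * t
    bracket = math.cosh(x) - (a / omega_n - xi) / r * math.sinh(x)
    return 1.0 - math.exp(-xi * omega_n * t) * bracket


def peak_closed_form(a, tau1, tau2):
    """Normalized peak error per (28); overdamped case with tau2 > 1."""
    omega_n, xi, r = _constants(a, tau1, tau2)
    angle = math.atanh(r / (xi - omega_n / a))
    return 1.0 + math.sqrt(tau1 - tau2) * math.exp(-xi / r * angle)


def peak_sampled(a, tau1, tau2, t_end=2.0, steps=20000):
    """Largest sampled value of the normalized error on [0, t_end]."""
    return max(normalized_error(t_end * k / steps, a, tau1, tau2)
               for k in range(steps + 1))


if __name__ == "__main__":
    a, tau1, tau2 = 75.0, 100.0, 40.0  # illustrative values only
    print(peak_closed_form(a, tau1, tau2), peak_sampled(a, tau1, tau2))
    print("rough estimate tau1/tau2 =", tau1 / tau2)
```

For these values the sampled and closed-form peaks agree, and both are more than twice the quiescent error, which is the pulse-like overshoot described above.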

The large overshoot can throw the loop out of synchronism if it ex- 
ceeds the capacity of the phase comparator. This effect will be discussed 
more fully in Section 8.5. 

The normalized peak phase error is shown on the filter plot in Fig. 11. 


VIII. LARGE-SIGNAL PROPERTIES 


We have examined the operation of the synchronized phase-controlled 
oscillator when the error is within the range of the phase comparator. 
For this “small-signal” case, the problem was completely linear. When 
the circuit is not in synchronism, or when disturbances of the input 
signal are large enough to produce a phase error which exceeds the range 
of the comparator, discontinuities are present in the output and the 
problem becomes nonlinear. Despite this difficulty, we have been able 
to analyze certain large-signal properties of the phase-controlled loop 
with a sawtooth comparator. These are the pull-in frequency, the seize 
frequency, the settling time, the maximum allowed frequency shift, 
and the effect of certain types of jitter on large-signal performance. 


8.1 Pull-in Frequency 


A very important property of the phase-locked loop is the range of 
frequencies that can pull the oscillator into synchronism. In general, 
this range is smaller than the range of frequencies which can be held in 
lock. When the system is not synchronized, the phase comparator goes 
through periodic discontinuities, which prevent the loop from synchro- 
nizing. Whether or not a loop will pull a given frequency into lock de- 
pends on the past history of the loop and the jitter of the signal. 

We define the pull-in frequency as the maximum steady mistuning 
of the input frequency that will always pull the circuit into synchronism. 
Frequencies outside of the pull-in range but inside the lock range may 
or may not be pulled in, depending on the initial conditions. 

Fig. 11 — Contours of normalized peak phase error caused by a step change 
in frequency. 

We can determine the pull-in frequency experimentally by mistuning 
the input beyond the lock frequency and then slowly reducing the mis- 
tuning until the circuit locks. When the mistuning exceeds the lock 
range, there are frequent discontinuities in the phase error; it appears 
to “flicker.” As the mistuning is slowly decreased, the flicker rate de- 
creases. 

When the mistuning is brought down to the pull-in frequency, the 
flicker mode becomes unstable. With the mistuning then held constant, 
just under the pull-in frequency, the phase error trajectory from 
discontinuity to discontinuity slowly changes as shown in Fig. 12. Finally, 
the error misses a discontinuity and synchronism is achieved. 

The pull-in frequency, then, is the mistuning for which the stable 
asynchronous mode disappears. For lower values of mistuning, the cir- 
cuit must eventually reach a synchronous condition since there is no 
asynchronous solution. 

A. J. Goldstein* has found an exact answer for the pull-in frequency 
$\omega_p$: 

Wp 1-—D 


Re ee 1 
Nra tanh d£w,7"5 + (D) tanh 3£0.To. (29) 


where $T_0$, the critical flicker period, is the smallest positive solution of 


So ape! tanh $£wn7'o 
ne pee 
Vaile 1 als 


= vat-B(1-4/1-2) = Gi, 
T2 71 
and D is given by 


_ avn é — DY — ale - D 
= cr — m1(2 = 1) 





For the underdamped case (damping ratio $\xi < 1$) the hyperbolic 
tangent is replaced by the trigonometric tangent. 

A. J. Goldstein* has used a digital computer to evaluate (29). The 
data is presented on the filter plot in Fig. 13. 

We can see from (29) that the pull-in frequency is directly 
proportional to the lock frequency $N\pi a$, for a given set of parameters $\tau_1$ and 
$\tau_2$. We will call $\omega_p/N\pi a$ the pull-in to lock ratio, or the relative pull-in. 

Fig. 12 — Scope trace of the phase error after the mistuning is brought just 
below the pull-in frequency. The flicker mode becomes unstable. 

Fig. 13 — Contours of the pull-in to lock ratio on the filter plot. 


We have shown that the small-signal properties of the 
phase-controlled oscillator are completely specified by the parameters $\tau_1$, $\tau_2$ and 
$a$. Therefore, for constant small-signal performance (such as noise 
bandwidth), the pull-in range is proportional to the count ratio N. We can 
get any pull-in frequency we wish by using a large enough count ratio. 

There are two limitations on increasing the count ratio. The first is 
economy; high counts require more equipment. The second is theoretical. 
The comparator supplies data only once every period of the submultiple 
frequency. For our analysis to be valid, the submultiple frequency should 
be much higher than the cutoff frequency of the forward path, which 
is of the order of $\omega_n$. This limits the maximum count. 

For $\tau_2 > 1$, and $\tau_2/\tau_1 < 0.5$, the pull-in frequency approaches 

$$\frac{\omega_p}{N\pi a} = \frac{2}{\sqrt{3}}\sqrt{\frac{\tau_2}{\tau_1}}. \tag{30}$$

It is interesting to compare the pull-in frequency of a sawtooth 
comparator to that of a sinusoidal comparator¹ with the same gain at zero 
error. The normalized pull-in frequencies for both types of comparator 
are shown in Fig. 14, for a damping ratio $\xi$ of 1/2. 

Fig. 14 shows that the sawtooth phase detector has a pull-in frequency 
at least twice that of a sinusoidal detector which has the same 
small-signal performance. 





8.2 Figure of Merit 


In most applications, a large pull-in frequency and a small noise band- 
width are desired. Unfortunately, these requirements are antagonistic, 
since a small noise bandwidth means that the loop cannot react to a 
rapidly flickering phase error. Examination of the formulas for pull-in 
(29) and jitter noise bandwidth (18) shows that both are proportional 
to the gain, a. Therefore a natural figure of merit is the ratio of pull-in 
frequency to the jitter noise bandwidth: 


$$M = \frac{\omega_p}{B_j}. \tag{31}$$


Since the pull-in frequency is proportional to the count ratio N while 
the noise bandwidth is independent of N, the figure of merit is 
proportional to N. This means that we can get as large a value of pull-in as we 
wish for a given noise bandwidth, if we are willing to use a large count 
ratio. 

Fig. 14 — Normalized pull-in of the sinusoidal and sawtooth comparators for 
a damping ratio of 1/2. 

The normalized figure of merit M/N is shown on the filter plot in 
Fig. 15. 

D. Richman’ has defined a different figure of merit, since he wished 
to compromise between noise bandwidth and gain. His figure of merit 
is equivalent to our normalized noise bandwidth (18), plotted in Fig. 7. 

Fig. 15 — Contours of normalized figure of merit on the contour plot. The 
normalized figure of merit is the ratio of the pull-in to the noise bandwidth, 
divided by the count ratio N. 


8.3 Seize Frequency 


As long as the mistuning of a signal is less than the pull-in frequency, 
we can be sure the circuit will lock; but it may flicker for a long while 
before it does. 

For some applications, it is important that the circuit synchronize 
immediately on a signal that has just started, without flickering through 
discontinuities. We define the seize frequency $\omega_s$ as the maximum 
mistuning of a suddenly connected signal that cannot cause a discontinuity 
after the initial phase jump (see Fig. 16). 

Fig. 16 — Scope traces of the phase error during capture. 

We have described a phase comparator which produces a zero error 
signal when there is no input signal. With such a device, the effect of 
suddenly connecting a signal is equivalent to a step phase shift of an 
arbitrary value between $-N\pi$ and $+N\pi$ and a step change in frequency 
equal to the mistuning of the signal $\omega_m$. 

In the marginal case, the phase error between the oscillator and the 
signal at the instant of connection is nearly $N\pi$. The seize frequency is 
the value of mistuning for which the initial derivative of the phase error 
is zero, so that no discontinuity results. It is easily shown that 

$$\frac{\omega_s}{N\pi a} = \frac{\tau_2}{\tau_1}. \tag{32}$$





Note that a circuit with an RC filter ($\tau_2 = 0$) may go through a 
discontinuity for any nonzero mistuning, if the initial phase shift is large 
enough. 

According to Richman’ the seize frequency for the sinusoidal 
comparator is $a(\tau_2/\tau_1)$. As indicated by a comparison with (32), the seize 
frequency in general is simply $\tau_2/\tau_1$ times the lock frequency. 


8.4 Settling Time 


The settling time is the time required for the phase error to settle 
nearly to its steady state value after a change in input conditions. If no 
discontinuity occurs, the settling time $t_s$ may be estimated to be the 
time at which the damping term $e^{-\xi\omega_n t}$ [in (26) and (27)] decays to 0.1. 
Then, substituting for $\xi$ according to (11), 

$$t_s = \frac{4.6}{1+\tau_2}\,T_1. \tag{33}$$
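As a small worked sketch of the settling-time estimate, the function below solves $e^{-\xi\omega_n t} = 0.1$ directly. It assumes $\xi\omega_n = (1+\tau_2)/(2T_1)$, an inferred relation that reproduces the 4.6 factor of (33); the function name and the `fraction` parameter are hypothetical.

```python
import math


def settling_time(T1, tau2, fraction=0.1):
    """Time for exp(-xi*omega_n*t) to decay to `fraction`.

    Assumes xi * omega_n = (1 + tau2) / (2 * T1), which is inferred and
    not quoted from Eq. (11). With fraction = 0.1 this gives
    2 * ln(10) * T1 / (1 + tau2), i.e. about 4.6 * T1 / (1 + tau2).
    """
    return -math.log(fraction) * 2.0 * T1 / (1.0 + tau2)


if __name__ == "__main__":
    # Illustrative values: T1 = 3.67 sec, tau2 = 33.
    print(settling_time(3.67, 33.0))
```

With $T_1 = 3.67$ sec and $\tau_2 = 33$ the estimate is roughly half a second.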

If a discontinuity is crossed, an additional time will be required to 
allow the flickering to die out. During each flicker period a small charge 
is added to the filter capacitor, bringing the average output frequency 
of the oscillator closer to the input frequency. Finally, the circuit locks. 

The flicker time for a given mistuning depends on the initial conditions, 
especially on the initial capacitor voltage. For the special case of a 
suddenly connected signal (initial capacitor voltage zero), D. Richman has 
derived® an approximation for $t_F$, the time in the flicker state, for the 
sinusoidal comparator. 

He assumes that the capacitor voltage does not change appreciably 
during a single flicker period; in effect, he replaces the capacitor with a 
variable battery. Further, Richman neglects the effect of the initial 
phase. By applying his methods to the sawtooth comparator, we ob- 
tain: 


man _ : (=) 
te / 2 SR) So eer oe 
T» Om/loL Wm T1 Wr T2 —1 { T1 @y Va 
— —({(——)+(1——)}coth (—— 
OL T2 WL TL T2 OL 


(34) 

where $\omega_i$ is an "instantaneous mistuning" parameter introduced by 
Richman. 

Equation (34) is a good approximation for $\tau_2 \gg 1$ and $t_F > t_s$. 

To carry out the integration, we must use numerical methods. We have 
plotted tr/T, against w,,/w, for various values of 72/7, in Fig. 17. Ex- 
perimental results are also shown in Fig. 17. 

Note that $t_F$ goes to zero as $\omega_m$ approaches the seize frequency and 
to infinity as $\omega_m$ approaches the pull-in frequency. If a short pull-in 
time is important, the mistuning frequency should not be allowed to 
approach the pull-in frequency. 


8.5 Maximum Frequency Shift 


Consider a phase-controlled oscillator which is locked on an input 
which is frequency modulated by a digital signal (frequency-shift 
keying). Let us find the maximum frequency shift that will not cause the 
phase error to cross a comparator discontinuity. We assume that the 
center frequency of the oscillator is set midway between the two signal 
frequencies. We further assume that the time constants of the 
phase-controlled oscillator are much smaller than the maximum time between 
shifts, so that the circuit may be in steady state before the next shift 
occurs. 

Fig. 17 — Flicker time during pull-in. The time is zero for mistuning less than 
the seize frequency and infinity for mistuning greater than the pull-in frequency. 

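We will consider the case of a sudden increase of frequency $\Delta\omega_i$. The 
initial phase error is $-(\Delta\omega_i/2a)$. The maximum allowed phase error is 
$+N\pi$. Thus the peak change in phase error $\hat{\varphi}_e$, caused by the maximum 
allowable change of input frequency $\Delta\hat{\omega}_i$, is 

$$\hat{\varphi}_e = N\pi + \frac{\Delta\hat{\omega}_i}{2a}. \tag{35}$$

The error $\hat{\varphi}_e$ has been given in (28). Solving (35) for $\Delta\hat{\omega}_i$ in terms of 
$\hat{\varphi}_e/\Delta\hat{\omega}_i$, we have: 

$$\frac{\Delta\hat{\omega}_i}{N\pi a} = \frac{1}{\dfrac{a\hat{\varphi}_e}{\Delta\hat{\omega}_i} - \dfrac{1}{2}}. \tag{36}$$

In the presence of mistuning, $N\pi$ in (35) and (36) is replaced by the 
phase error margin given in (3). Values of $a\hat{\varphi}_e/\Delta\omega_i$ have been plotted in Fig. 11. 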
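Relation (36) is simple enough to evaluate directly once the normalized peak error is known, either read from Fig. 11 or computed from (28). The sketch below is illustrative; the function name and argument names are hypothetical.

```python
import math


def max_frequency_shift(peak_gain, N, a):
    """Largest allowed frequency step in rad/sec, following (36).

    peak_gain is the normalized peak phase error a*phi_hat/delta_omega.
    It must exceed 1/2 for the expression to be meaningful.
    """
    if peak_gain <= 0.5:
        raise ValueError("peak_gain must exceed 0.5")
    return N * math.pi * a / (peak_gain - 0.5)


if __name__ == "__main__":
    # With no overshoot (peak_gain = 1) the shift is twice the lock frequency:
    # the static error swings from -N*pi to +N*pi.
    print(max_frequency_shift(1.0, 4, 75.0))
    # A peak gain of 5 (illustrative) reduces the allowed shift considerably.
    print(max_frequency_shift(5.0, 4, 75.0))
```

The first case recovers the expected limit of twice the lock frequency; larger overshoot shrinks the usable frequency shift in proportion.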


8.6 Effective Comparator Characteristic in the Presence of Fast Jitter 


One of the functions of a phase-locked oscillator is to produce a steady 
output despite jitter and noise in the input signal. Therefore, we can 
expect that a major part of the phase comparator output will have 
frequencies much higher than the oscillator can follow. In such a situation 
only the low frequency component of the comparator output is 
significant in controlling the circuit. 

The low-frequency component of the comparator output is the time 
average taken over a time interval which is longer than the period of 
the predominant jitter, but shorter than the response time of the cir- 
cuit. The following analysis assumes that such an intermediate time 
range exists; i.e., that there is very little jitter whose frequency is low 
enough to cause the circuit to respond. 

Let us write the phase error as the sum of a low-frequency component 
$\varphi_{eo}$ and a fast jitter component $\varphi_{ej}$. Then the instantaneous output of 
the phase comparator is $f(\varphi_{eo} + \varphi_{ej})$. The average output of the 
comparator is 

$$\bar{v}_1 = \frac{1}{T_a}\int_0^{T_a} f(\varphi_{eo} + \varphi_{ej})\,dt, \tag{37}$$

where $T_a$ is the averaging time. 

Let us define an effective comparator characteristic in the presence of 
jitter: 

$$\bar{v}_1 = f_j(\varphi_{eo}). \tag{38}$$

This new characteristic governs the response of the circuit to slow phase 
changes in the presence of fast jitter. 

Now we assume that the time of integration is such that the time 
spent at each value of $\varphi_{ej}$ is proportional to the probability density of 
$\varphi_{ej}$ at that value. For random processes, this requires that $T_a$ be much 
greater than the correlation time of the process. If $\varphi_{ej}$ is periodic, it is 
sufficient that $T_a$ be equal to one period. 

If this assumption is valid, we can replace the time integral (37) by 
an ensemble integral: 

$$f_j(\varphi_{eo}) = \int_{-\infty}^{+\infty} f(\varphi_{eo} + \varphi_{ej})\,p(\varphi_{ej})\,d\varphi_{ej}, \tag{39}$$

where $p(\varphi_{ej})$ is the probability density of the jitter. 

Equation (39) represents a smoothing operation by the jitter 
probability function upon the comparator characteristic. If the density 
function has even symmetry, (39) is analogous to the general filter equation 

$$v_{out}(t) = \int_{-\infty}^{+\infty} v_{in}(t - \tau)\,r(\tau)\,d\tau, \tag{40}$$

where $r(\tau)$ is the impulse response of a hypothetical filter. 

The effective sawtooth comparator characteristic for Gaussian, 
sinusoidal, and square wave jitter is shown in Fig. 18. These photographs 
were obtained by opening the phase-controlled oscillator loop and 
allowing the oscillator to free run. This means that the average phase error 
$\varphi_{eo}$ increases linearly with time. The phase comparator output was passed 
through a low-pass filter to obtain $f_j(\varphi_{eo})$. 

Note that jitter always decreases the peak comparator output voltage. 

For Gaussian noise, we can evaluate (39) by neglecting the 
possibility of jitter crossing two or more discontinuities. Then the effective 
comparator characteristic for ($-N\pi < \varphi_{eo} < +N\pi$) is 

$$f_j(\varphi_{eo}) = \varphi_{eo} + 2N\pi\,\frac{1}{\sqrt{2\pi}}\left[\int_{y_1}^{\infty} e^{-x^2/2}\,dx - \int_{y_2}^{\infty} e^{-x^2/2}\,dx\right], \tag{41}$$

where 

$$y_1 = \frac{N\pi + \varphi_{eo}}{(\varphi_{ej})_{rms}}, \qquad y_2 = \frac{N\pi - \varphi_{eo}}{(\varphi_{ej})_{rms}},$$

and $(\varphi_{ej})_{rms}$ is the root mean square phase error due to fast jitter. 
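The smoothing described by (39) can be tried numerically for a sawtooth comparator with Gaussian jitter and compared with the single-crossing closed form (41). This is an illustrative sketch: the sawtooth is taken to have unit slope and discontinuities at odd multiples of $N\pi$, the integral is a plain midpoint rule, and all function names are hypothetical.

```python
import math


def sawtooth(phi, N):
    """Unit-slope sawtooth with range (-N*pi, N*pi) and period 2*N*pi."""
    half = N * math.pi
    return (phi + half) % (2.0 * half) - half


def gaussian_tail(y):
    """(1/sqrt(2*pi)) * integral from y to infinity of exp(-x^2/2) dx."""
    return 0.5 * math.erfc(y / math.sqrt(2.0))


def effective_closed_form(phi0, N, sigma):
    """Effective characteristic following (41); sigma is the rms jitter."""
    half = N * math.pi
    y1 = (half + phi0) / sigma
    y2 = (half - phi0) / sigma
    return phi0 + 2.0 * half * (gaussian_tail(y1) - gaussian_tail(y2))


def effective_numeric(phi0, N, sigma, steps=20000, span=8.0):
    """Midpoint-rule evaluation of the ensemble integral (39)."""
    h = 2.0 * span * sigma / steps
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for k in range(steps):
        x = -span * sigma + (k + 0.5) * h
        density = norm * math.exp(-0.5 * (x / sigma) ** 2)
        total += sawtooth(phi0 + x, N) * density * h
    return total


if __name__ == "__main__":
    phi0, N, sigma = 2.5, 1, 0.5  # illustrative values
    print(effective_closed_form(phi0, N, sigma), effective_numeric(phi0, N, sigma))
```

For this choice the two values agree closely and both lie below the jitter-free output $\varphi_{eo}$, consistent with the remark that jitter lowers the peak comparator output.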

The peak effective comparator output for Gaussian jitter is plotted 
in Fig. 19. 

The effective comparator function for the sinusoidal comparator is 
very easy to find, using the filter analogy: 

$$f(\varphi_e) = \sin\varphi_e, \qquad f_j(\varphi_{eo}) = e^{-(\varphi_{ej})^2_{rms}/2}\,\sin\varphi_{eo}. \tag{42}$$

Therefore the effect of high-frequency Gaussian jitter for the 
sinusoidal comparator is simply to reduce the loop gain. 

In general, the presence of fast jitter causes a deterioration of large 
signal performance. For example the lock frequency depends directly 
upon the peak comparator output, which decreases as jitter increases. 

Fig. 18 — The phase comparator characteristic in the presence of fast jitter. 

Fig. 19 — Normalized lock frequency (peak comparator output) in the 
presence of fast Gaussian jitter. 


8.7 False Synchronization Mode 


As shown in Fig. 18(d), the sawtooth comparator characteristic can 
have a region with positive slope centered on an average phase error 
$N\pi$. This means that the circuit can synchronize in this region instead 
of the region near zero error. In this false mode the jitter continually 
crosses and recrosses the discontinuity. 

Fortunately, this undesirable mode is possible only for certain types 
and amplitudes of jitter. We can test for the possibility of the false mode 
by examining the slope of $f_j(\varphi_{eo})$ at $N\pi$. We can write $f(\varphi_e)$ in the 
vicinity of $N\pi$ as $\varphi_e - 2N\pi\,U(\varphi_e - N\pi)$, where U is the unit step function. 
Substituting this in (39) and taking the derivative, we have 

$$\left.\frac{df_j(\varphi_{eo})}{d\varphi_{eo}}\right|_{N\pi} = 1 - 2N\pi\,p(0), \tag{43}$$

where p(0) is the probability density of $\varphi_{ej}$ at 0. Therefore the false 
mode is possible when $p(0) < 1/2N\pi$. 

For square wave jitter, p(0) = 0. Therefore the false mode is always 
possible. 

For sine wave jitter with an amplitude A, $p(0) = 1/A\pi$. Therefore 
the false mode is possible only when $A > 2N$. Since the comparator can 
accommodate only jitter error amplitudes less than $\pm N$ in the normal 
mode, we are not likely to encounter sinusoidal jitter large enough to 
support the false mode. 
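The slope test of (43) reduces to a one-line check. The sketch below is illustrative; the function names are hypothetical, and the sine-wave density at zero, $1/A\pi$, is the value quoted in the text.

```python
import math


def false_mode_slope(p0, N):
    """Slope of the effective characteristic at N*pi, following (43)."""
    return 1.0 - 2.0 * N * math.pi * p0


def false_mode_possible(p0, N):
    """The false mode needs a positive slope at N*pi."""
    return false_mode_slope(p0, N) > 0.0


def sine_density_at_zero(A):
    """Probability density at zero of sine-wave jitter of amplitude A."""
    return 1.0 / (A * math.pi)


if __name__ == "__main__":
    N = 1
    print(false_mode_possible(0.0, N))                        # square wave
    print(false_mode_possible(sine_density_at_zero(1.0), N))  # A < 2N
    print(false_mode_possible(sine_density_at_zero(3.0), N))  # A > 2N
```

The three cases reproduce the statements in the text: square wave jitter always permits the false mode, and sine wave jitter permits it only when the amplitude exceeds $2N$.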

It can be shown by using the filter analogy that Gaussian jitter cannot 
produce the false mode; the slope of $f_j(N\pi)$ is always negative. The p(0) 
criterion is not applicable in the case of Gaussian jitter because more 
than one discontinuity is involved. 

We see that the false mode need be considered only for signals with 
jitter such that p(0) is very small. 

Even if the false mode has been established, a lull in the jitter will 
cause the circuit to jump to the normal mode. It will stay in the 
normal mode even if jitter returns, as long as no discontinuities occur. 


IX. DESIGN METHODS 


We have analyzed many properties of the phase-controlled oscillator 
with a sawtooth comparator. Some of these properties, notably the lock 
range, pull-in range, and noise bandwidth are significant in nearly all 
applications of the device. Others, such as peak jitter gain, seize fre- 
quency, and settling time are important only for certain specific applica- 
tions. 

Usually, in a particular design problem, two or three of the properties 
will be of prime importance and the rest can be neglected. Then the 
problem is to find the values of the design parameters ($a$, $\tau_1$, $\tau_2$) which 
yield the best combination of the important properties. If the properties 
are simple, like the lock frequency ($N\pi a$), it is easy to find the best 
design. 


9.1 Filter Plot 


Unfortunately, most of the properties of the phase-controlled 
oscillator turn out to be complicated transcendental functions of the design 
parameters $\tau_1$ and $\tau_2$. Therefore we have presented many of the 
properties as contour curves on a plot of $\tau_1$ vs. $\tau_2$, which we call the filter 
plot (Figs. 5, 7, 8, 11, 13, and 15). Most of the properties are normalized 
through division by the gain constant a. In some cases, the count ratio 
N is also used as a normalizing factor. 

$\tau_1$ and $\tau_2$ are the time constants of the phase lag filter (Fig. 4b), 
multiplied by the gain a. We have plotted $\tau_1$ and $\tau_2$ on logarithmic scales, 
to allow the presentation of large ranges. A useful property of these 
scales is that a given percentage change in $\tau_1$ or $\tau_2$ appears as a constant 
displacement on the plot. This facilitates estimating the effect of 
variations of the parameters. 

$\tau_1$ is always larger than $\tau_2$; therefore the possible designs are restricted 
to the region below the 45° line on the plot. Points along this 45° line 
are identical, and correspond to the case of no filter. When $\tau_2$ is zero, 
the phase lag filter degenerates to the RC filter. Since this case is of some 
interest, we have provided a zero $\tau_2$ axis below the plot and indicated 
the intersection of the various contours with this line. 


9.2 Approximate Relations 


An examination of the filter plots shows that there are large regions 
where the contours approach straight lines. It is possible to derive sim- 
plified formulas for these regions. A summary of these approximations 
and the conditions for their validity is given below. 


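Pull-in frequency: 
$$\frac{\omega_p}{N\pi a} \approx \frac{2}{\sqrt{3}}\sqrt{\frac{\tau_2}{\tau_1}} \qquad (\tau_2 > 1) \tag{44}$$

Noise bandwidth: 
$$\frac{B_j}{\pi a} \approx \frac{1}{2}\,\frac{\tau_2}{\tau_1} \qquad (\tau_2^2/\tau_1 \gg 1) \tag{45}$$

Figure of merit: 
$$\frac{M}{N} \approx \frac{4}{\sqrt{3}}\sqrt{\frac{\tau_1}{\tau_2}} \qquad (\tau_2^2/\tau_1 > 1) \tag{46}$$

Peak error, frequency step: 
$$\frac{a\hat{\varphi}_e}{\Delta\omega_i} \approx \frac{\tau_1}{\tau_2} \qquad (\tau_2^2/\tau_1 > 1) \tag{47}$$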

Equation (44) has been derived from (29) by A. J. Goldstein.* Equa- 
tion (45) can easily be found from (18). Equation (46) is found by 
dividing (44) by (45), according to the definition (31). Equation (47) 
can be derived from (28). 

These approximations sometimes allow analytic methods to be used 
to find an approximate optimum solution. This requires justification of 
operating in the region where the approximations are valid. 
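The approximate relations (44) through (47) are easy to tabulate. The sketch below is illustrative only; the function names are hypothetical, and the results are meaningful only in the region where the stated conditions on $\tau_1$ and $\tau_2$ hold.

```python
import math


def pull_in_ratio(tau1, tau2):
    """omega_p / (N*pi*a), approximation (44)."""
    return 2.0 / math.sqrt(3.0) * math.sqrt(tau2 / tau1)


def noise_bandwidth(tau1, tau2):
    """B_j / (pi*a), approximation (45)."""
    return 0.5 * tau2 / tau1


def figure_of_merit(tau1, tau2):
    """M / N, approximation (46), formed as (44) divided by (45)."""
    return pull_in_ratio(tau1, tau2) / noise_bandwidth(tau1, tau2)


def peak_error(tau1, tau2):
    """a * phi_hat / delta_omega, approximation (47)."""
    return tau1 / tau2


if __name__ == "__main__":
    tau1, tau2 = 275.0, 33.0  # illustrative values with tau2/tau1 = 0.12
    print(pull_in_ratio(tau1, tau2))
    print(noise_bandwidth(tau1, tau2))
    print(figure_of_merit(tau1, tau2))
    print(peak_error(tau1, tau2))
```

Forming the figure of merit as the quotient of the first two functions mirrors the definition (31) and reproduces the factor $4/\sqrt{3}$ in (46).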


9.3 Optimization Techniques 


There are several types of optimization methods, which we shall dis- 
cuss in order of increasing difficulty. 

The simplest method optimizes one property by varying one param- 
eter, all other parameters being fixed. This yields a class of designs which 
has one less parameter than the general case. The remaining param- 
eters can be assigned to satisfy requirements on other properties, in 
confidence that the final design will have high performance for the op- 
timized property. 

An example of this approach has appeared in the literature.’ The 
gain a and the time constant $\tau_1$ (which together specify the resonant 
frequency) are held constant and the time constant $\tau_2$ is varied to 
minimize the noise bandwidth $B_j$. This process restricts the design to 

$$\tau_2 + 1 = \sqrt{\tau_1 + 1}. \tag{48}$$

For large values of $\tau_1$ the damping ratio $\xi$ approaches 0.5. Equation 
(48) is plotted in Fig. 20, against the figure of merit contours. 

Let us describe one procedure for designing a circuit using (48). The 
gain a can be set to give the proper lock frequency. Then $\tau_1$ and $\tau_2$ can 
be set to give the required pull-in frequency, while satisfying (48). 

This approach is good, and yields rather useful designs. However, it 
does not necessarily produce the best possible design for a given set of 
requirements. 

Fig. 20 — "Optimized" designs of Jaffe and Rechtin” and T. Rey,¹ with the 
figure of merit contours on the filter plot. 


For example suppose the lock and pull-in frequencies have been 
specified, with a pull-in to lock ratio of 0.5. Following the above procedure, 
we compare Figs. 13 and 20 to find that $\tau_1$ and $\tau_2$ should be 18 and 3.2 
to satisfy (48) and have $\omega_p/\omega_L = 0.5$. From Fig. 7 we find that the 
normalized noise bandwidth is 0.19. 

To see that a better design than this is possible, suppose that $\tau_1$ and 
$\tau_2$ were 100 and 20. Then the noise bandwidth would be 0.12, a large 
improvement. 
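The two designs can be compared numerically. The sketch below is illustrative and rests on an assumption: the closed form used for the normalized noise bandwidth, $(\tau_1 + \tau_2^2)/[2\tau_1(1+\tau_2)]$, is the standard one-sided integral of the squared closed-loop response for a phase-lag loop and is assumed, not confirmed, to match (18). It does reduce to (45) for large $\tau_2$, and its minimum over $\tau_2$ gives (48). Function names are hypothetical.

```python
import math


def normalized_noise_bandwidth(tau1, tau2):
    """B_j / (pi*a); assumed closed form, not quoted from Eq. (18)."""
    return (tau1 + tau2 ** 2) / (2.0 * tau1 * (1.0 + tau2))


def tau2_on_design_curve(tau1):
    """tau2 satisfying (48): tau2 + 1 = sqrt(tau1 + 1)."""
    return math.sqrt(tau1 + 1.0) - 1.0


if __name__ == "__main__":
    # Design read from the curves for (48), and the alternative design.
    print(normalized_noise_bandwidth(18.0, 3.2))
    print(normalized_noise_bandwidth(100.0, 20.0))
    # For fixed tau1 = 18, the curve (48) gives the tau2 of least bandwidth.
    print(tau2_on_design_curve(18.0))
```

Under this assumption the two designs evaluate to about 0.19 and 0.12, matching the values read from Fig. 7, and the $\tau_2$ given by (48) for $\tau_1 = 18$ is close to the graphical value 3.2.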

A more powerful technique is possible when some properties are 
specified by system requirements and another property should be optimized. 
The specified properties are used to restrict the range of the design 
parameters. Then the remaining range is examined to seek the optimum 
design. 

For example, suppose that the lock range has been specified, and the 
normalized noise bandwidth is required to be less than 0.2. It is desired 
to maximize the pull-in frequency. A comparison of Fig. 7 and Fig. 13 
shows that the design should lie on the upper part of the 0.2 noise 
bandwidth contour, and $\tau_1$ and $\tau_2$ should be as large as possible. 

The most common problems require a compromise design which yields 
good results for two or more properties. Sometimes it is possible to 
express the relative importance of the properties mathematically. Then 
the optimum design can be derived analytically. A good example of this 
is given by Jaffe and Rechtin,” where the desirable properties are 
low-noise bandwidth and a high peak phase error due to a frequency step. 
Their design curve is shown in Fig. 20. 

More often the relative importance of the properties is indistinctly 
known, and the engineer must use his judgment in striking a compro- 
mise. The filter plots are intended to aid this process by giving the en- 
gineer a ‘feel’ for the circuit properties over the entire range of the 
parameters. 


9.4 Numerical Example 


To show how the design aids we have presented can be used in practice, 
we will do a realistic problem. 

A phase-controlled oscillator is to be designed to smooth jitter in a 
1.5 megacycle signal. In the worst case of mistuning, the circuit must 
pull itself into synchronism. We wish to design a circuit with low jitter 
noise bandwidth. 

The uncertainty of the input signal is $\pm 10^{-5}$ or $\pm 15$ cycles per second. 

The oscillator center frequency is controlled by a frequency 
determining element, which we shall assume to be a crystal, and by the 
surrounding circuit. We take the range of the crystal as $\pm 10^{-5}$ or $\pm 15$ cps. 
The effect of variations in the circuit will depend on the control the 
circuit has on the crystal, which is in turn related to the gain a. We assume 
that the range of center frequency due to circuit variations is $\pm 0.2\,N\pi a$. 

The count ratio N is 4. 

Let us make the following definitions: 

$\delta_s$ — maximum deviation of the signal frequency (rad per sec) 

$\delta_0$ — maximum deviation of the crystal tuning (rad per sec) 

$\varepsilon$ — maximum deviation of the oscillator center frequency due to 
circuit variations, divided by $N\pi a$. 

Then the maximum mistuning (which determines the pull-in frequency) 
is 

$$\omega_m = \omega_p = \delta_s + \delta_0 + \varepsilon N\pi a. \tag{49}$$

If we assume that the final design will be in a region where the 
approximate relations hold, we can use (44) and (45) for the pull-in 
frequency $\omega_p$ and the jitter noise bandwidth $B_j$. 

When (44) and (49) are combined, we find 

$$\frac{\tau_2}{\tau_1} \approx \frac{3}{4}\left(\frac{\delta_s + \delta_0}{N\pi a} + \varepsilon\right)^2. \tag{50}$$

For this value of $\tau_2/\tau_1$, the jitter bandwidth is 

$$B_j = \frac{3}{8}\pi a\left(\frac{\delta_s + \delta_0}{N\pi a} + \varepsilon\right)^2. \tag{51}$$


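Note that the only variable is a. When we minimize $B_j$ by varying a, 
we obtain 

$$a = \frac{\delta_s + \delta_0}{N\pi\varepsilon}, \qquad \frac{\tau_2}{\tau_1} = 3\varepsilon^2, \qquad \omega_p = 2(\delta_s + \delta_0), \quad\text{and}\quad B_j = \frac{3}{2}\,\frac{(\delta_s + \delta_0)\,\varepsilon}{N}. \tag{52}$$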

When the numerical substitutions are made, we have 

$$a = 75 \text{ rad/sec per radian}, \qquad \frac{\tau_2}{\tau_1} = 0.12, \qquad \omega_p = 377 \text{ rad/sec, or 60 cps, and} \qquad B_j = 14 \text{ rad/sec, or 2.25 cps}. \tag{53}$$


Now we have not yet completely specified the design, because we only 
have the ratio of $\tau_2$ and $\tau_1$. We can be confident of the numbers above 
for any value of $\tau_1$, as long as we have the proper value of $\tau_2/\tau_1$ and as 
long as we stay in the region where the approximate relations are valid. 

If we make $\tau_1$ very large, we will require a very long time constant 
in the filter. Therefore we will make $\tau_1$ just large enough to satisfy the 
condition for the approximate noise bandwidth relation, $\tau_2^2/\tau_1 \gg 1$. 
Let us set $\tau_2^2/\tau_1 = 4$. Then, from (53) 

$$\tau_2 = 33, \qquad \tau_1 = 275, \qquad T_2 = \frac{\tau_2}{a} = 0.44 \text{ sec, and} \qquad T_1 = \frac{\tau_1}{a} = 3.67 \text{ sec}. \tag{54}$$


If high accuracy is required, the values of $\tau_2$ and $\tau_1$ given in (54) can 
be used to find the exact values of $\omega_p$ and $B_j$, instead of using the 
approximate values given in (53). 
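The arithmetic of this example can be replayed in a few lines. The sketch below follows (52) through (54) as reconstructed above; the function name `design` and its argument names are hypothetical, and the unrounded results differ slightly from the rounded figures in (54).

```python
import math


def design(signal_dev_cps, crystal_dev_cps, eps, N, tau2_sq_over_tau1=4.0):
    """Minimum-bandwidth design following (52) to (54).

    Deviations are given in cycles per second and converted to rad/sec.
    """
    delta = 2.0 * math.pi * (signal_dev_cps + crystal_dev_cps)
    a = delta / (N * math.pi * eps)     # gain that minimizes B_j
    ratio = 3.0 * eps ** 2              # tau2 / tau1
    omega_p = 2.0 * delta               # required pull-in, rad/sec
    b_j = 1.5 * delta * eps / N         # jitter noise bandwidth, rad/sec
    tau2 = tau2_sq_over_tau1 / ratio    # from tau2^2/tau1 and tau2/tau1
    tau1 = tau2 / ratio
    return {"a": a, "ratio": ratio, "omega_p": omega_p, "B_j": b_j,
            "tau1": tau1, "tau2": tau2, "T1": tau1 / a, "T2": tau2 / a}


if __name__ == "__main__":
    result = design(15.0, 15.0, 0.2, 4)
    for name, value in result.items():
        print(name, round(value, 3))
```

Running this with the values of the example gives a gain of 75, a pull-in frequency of about 377 rad/sec, and a noise bandwidth of about 14 rad/sec, in agreement with (53).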


X. CIRCUIT MODIFICATIONS 


A two mode system has often been used” to increase the pull-in fre- 
quency. In this system, a frequency detector as well as a phase detector 
is used; the output of the frequency detector adjusts the oscillator tuning 
until the phase-controlled loop can synchronize. This scheme greatly 
extends the pull-in range, but requires additional hardware. 

Another means of extending the pull-in frequency has been published 
by R. Ley.” Back-to-back diodes are placed across the series filter 
resistor $R_1$. When the circuit is in synchronism and the jitter is small, 
the diodes do not conduct. The small signal properties are just as we 
have analyzed them. However, if the circuit is not synchronized, the 
flickering error voltage will cause the diodes to conduct, shorting out 
$R_1$. This will bring the pull-in frequency up near the lock frequency. 
The major drawback of this method is that large jitter error voltages 
will make the diodes conduct, and be passed on to the oscillator. 

Either or both of these methods may be used to greatly extend the 
pull-in range if the other system requirements permit their use. 


XI. SUMMARY 


Nearly all the properties of the phase-controlled oscillator which 
have appeared in the literature have been analyzed for the case of the 
sawtooth comparator and the phase lag filter. 

New theoretical material has been introduced on the effects of fast 
noise and jitter. 

The sawtooth comparator has advantages over the sinusoidal 
comparator for many applications. The reason for this is that the gain of 
the sawtooth comparator remains constant over a broader range of 
operation. 

The properties of the phase-controlled oscillator are presented in a 
manner which facilitates design without unnecessary restrictions. Vari- 
ous methods of design are discussed, and numerical examples are pro- 
vided to illustrate the methods. 
GLOSSARY OF IMPORTANT SYMBOLS 

A Laplace transform is denoted by capitalizing the symbol. 

$B_j$                    jitter noise bandwidth 
$B_i$                    interference noise bandwidth 
                         demodulator noise bandwidth 
$f(\varphi_e)$           comparator function 
$f_j(\varphi_{eo})$      effective comparator function 
$G(\omega)$              any normalized noise transfer function 
$H(s)$                   filter transfer function 
$a_1$                    dc gain, comparator 
$a_2$                    dc gain, filter 
$a_3$                    frequency to voltage ratio, oscillator 
$a = a_1 a_2 a_3$        open loop dc gain 
$M = \omega_p/B_j$       figure of merit 
$N$                      count ratio 
$T_1$                    large filter time constant 
$T_2$                    small filter time constant 
$t_s$                    settling time 
$t_F$                    flicker time 
$v_1$                    comparator output voltage 
                         oscillator input voltage 
                         interference noise density 
                         signal voltage amplitude 
                         modulating voltage 
                         jitter transfer function 
                         peak jitter gain 
$\xi$                    damping ratio 
$\varphi_i$              input phase 
$\Delta\varphi_i$        change in input phase 
$\varphi_o$              output phase 
$\varphi_e$              phase error 
$\hat{\varphi}_e$        peak phase error 
                         phase error margin 
$\varphi_{eo}$           short-time average phase error 
$\varphi_{ej}$           phase error due to fast jitter 
$(\varphi_{ej})_{rms}$   root mean square of $\varphi_{ej}$ 
$\tau_1$                 large filter time constant (normalized) 
$\tau_2$                 small filter time constant (normalized) 
$\omega_i$               input frequency 
$\Delta\omega_i$         change in input frequency 
$\Delta\hat{\omega}_i$   maximum frequency shift 
$\omega_m$               mistuning frequency 
$\omega_n$               natural frequency 
$\omega_L$               lock frequency 
$\omega_p$               pull-in frequency 
$\omega_s$               seize frequency 
Analysis of the Phase-Controlled Loop
with a Sawtooth Comparator


By A. JAY GOLDSTEIN 
(Manuscript received October 18, 1961) 


Because of the recent interest in phase-controlled oscillators, a discussion
of the phase-controlled loop with a sawtooth comparator is presented. The
main emphasis is on finding the pull-in range of the loop. A companion
paper in this issue (Ref. 4) deals with applications and shows how design
parameters can be obtained from results developed here.


I. INTRODUCTION 


The phase-controlled oscillator has evoked much interest in recent
years. Some of its applications are to synchronism in television,’
synchronization to a harmonic of a crystal oscillator,’ elimination of jitter
in pulse code modulation,’ tracking filters, etc.

The general phase-controlled oscillator loop is given in Fig. 1. The
incoming signal and the variable oscillator have the same free-running
frequency $\omega_c$. The phase comparator has as its output some function $f$
of the phase difference $\varphi_e = \varphi_i - \varphi_o$. As examples of $f(\varphi_e)$ we have

the linear case: $f(\varphi_e) = \varphi_e$
the sinusoidal case: $f(\varphi_e) = \sin\varphi_e$
the sawtoothed case (see Fig. 2): $f(\varphi_e) = \varphi_e$ for $-d/2 \le \varphi_e < d/2$,
$f(\varphi_e + nd) = f(\varphi_e)$ for $n = \pm 1, \pm 2, \ldots$
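
The three comparator characteristics can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the default period d = 2π is an assumption (the text leaves d general), and the half-open interval [-d/2, d/2) for the sawtooth is a convention chosen here so that the function is single-valued at the jump.

```python
import math

def linear(phi_e):
    """Linear comparator: f(phi_e) = phi_e."""
    return phi_e

def sinusoidal(phi_e):
    """Sinusoidal comparator: f(phi_e) = sin(phi_e)."""
    return math.sin(phi_e)

def sawtooth(phi_e, d=2.0 * math.pi):
    """Sawtooth comparator: f = phi_e on [-d/2, d/2), repeated with period d.

    The period d is a free parameter; 2*pi is only an assumed default.
    """
    return (phi_e + d / 2.0) % d - d / 2.0
```

For small phase errors all three agree to first order; they differ in how far from zero the characteristic stays linear, which is the property the sawtooth comparator is chosen for.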


The output of the phase comparator passes through a filter whose
impulse response is $h(t)$. The output of the filter $v(t)$ controls the variable
oscillator according to the equation

\[ \frac{d\varphi_o}{dt} = \alpha v(t). \tag{1} \]
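[Figure 1: block diagram. The input $\sin(\omega_c t + \varphi_i)$ enters the phase comparator, whose output $f(\varphi_i - \varphi_o)$ passes through the filter $h(t)$; the filter output $v(t)$ drives the variable oscillator, whose output $\sin(\omega_c t + \varphi_o)$ is returned to the comparator.]

Fig. 1 — The general phase-controlled loop.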
Thus, the frequency of the controlled oscillator is

\[ \omega_c + \frac{d\varphi_o}{dt} = \omega_c + \alpha v(t). \]

In a companion paper in this issue, C. J. Byrne (Ref. 4) discusses the
engineering origins and applications of the sawtoothed comparator and shows
how design parameters can be obtained from the results of this article.

This article is primarily concerned with finding the pull-in range of
the loop. This is defined precisely in Section III. Briefly, it is the maximum
asymptotic (in time) value of the mistuning $d\varphi_i/dt$ for which the slave
oscillator eventually synchronizes or locks to the input frequency. All of the
literature cited in the references deals with this problem for the case of
a sinusoidal or linear phase comparator. The linear case is easily solved
since the resulting differential equation is linear. (See in particular
Labin’ for a detailed discussion.) In the sinusoidal case the differential
equation of the system is nonlinear. Only in the cases of no filter and an
ideal integrator has the equation, up to the present, been solved in closed
form. See Labin’ for an excellent discussion of the no-filter case. In order
to handle the nontrivial filter, many authors have used methods of
phase plane analysis.°”"* Phase plane analysis is restricted to the problem
of capture range in which the mistuning and phase error are zero
for negative time, and the mistuning is constant for positive time. This
kind of analysis gives only upper and lower bounds for the capture
range and is restricted to a lag filter (Fig. 3). For an RC filter ($R_2 = 0$),
Barnard® shows how phase plane analysis can give exact results.

Fig. 2 — The sawtoothed phase comparator characteristic $f(\varphi_e)$. The phase error
$\varphi_e$ is the difference between the input and output phases of the loop.

To obviate the mathematical complexities, people have resorted to 
making various hypotheses about the nature of the solution of the non- 
linear differential equation. These assumptions are based upon physical 
intuition and gross behavior observed in the laboratory. Different as- 
sumptions have led to different approximate solutions for the capture 
range. Moreover, they deal primarily with the lag filter, since it leads 
to a second-order differential equation while a more general filter gives 
a higher-order differential equation. 

The loop equation when expressed as an integral equation is

\[ \frac{d\varphi_e}{dt} = -\alpha \int_0^t f(\varphi_e(t'))\,h(t - t')\,dt' + \frac{d\varphi_i}{dt} - \alpha v_0(t). \]


It is surprisingly tractable for the sawtooth comparator, and the pull-in 
range can be computed for any filter. Fig. 4 shows the excellent agree- 
ment between theory and experiment for the lag filter. These experi- 
mental results were obtained by C. J. Byrne. 

To obtain our results, we too must make an assumption. While the
assumptions other authors have made deal with the behavior (in steady
state) when far outside the pull-in range, ours deals with the behavior
just outside of the pull-in range (see Section 4.4). This hypothesis is
easily verified experimentally and has been so verified by C. J. Byrne
for a representative selection of RC filters.

A brief description of each section follows. 

Section II gives the basic integro-differential equation of the loop.

Section III defines the lock and pull-in range. The former is called by
some the pull-out range. The lock range is the maximum frequency
difference that the loop can lock to. It is given by

\[ \omega_L = \alpha f_{\max} H(0) \]

where $H(0)$ is the dc gain of the filter and $f_{\max}$ is $d/2$ for the sawtooth
comparator.

Fig. 3 — The integral compensating or lag filter. The normalized time constants
are $\tau_1 = \alpha(R_1 + R_2)C$ and $\tau_2 = \alpha R_2 C$. For an RC filter $\tau_2 = 0$.
$\alpha$ = (V.F.O. output frequency shift)/(V.F.O. input voltage).

Fig. 4 — The relative pull-in range. For critical damping $(\tau_2 + 1)^2/4 = \tau_1$ and
for the RC filter $R_2 = 0$.

Section 4.1 gives the solution of the basic loop equation. This solution 
is the sum of (1) the solution of the linear phase comparator problem, 
(2) a series of step functions, and (3) a series of damped exponentials. 
The solution is obtained by representing the phase comparator function 
as the sum of the phase difference [giving (1)] and a series of translated 
unit step functions [giving (2) and (3)]. 

Section 4.2 gives the steady-state solution when not captured. In this
case the output of the phase comparator is a periodic function whose
period for a fixed filter depends on the asymptotic relative mistuning
(Fig. 5). By examining this non-capture situation we obtain the pull-in
range. We observe that in the non-capture state the period and relative
mistuning must correspond to a point on a curve typified in Fig. 5.
Hence a relative mistuning lying below the minimum point of the curve
corresponds to a capture or synchronized situation, and the height of
the minimum gives the ratio of pull-in to lock range (the relative pull-in
$y_p$).

Section V gives all the explicit design formulae for the lag filter. For
the special case of the RC filter ($R_2 = 0$ in Fig. 3) an explicit formula
for relative pull-in can be given, namely

\[ y_p = \begin{cases} \tanh\left[\dfrac{\pi}{4}\left(\alpha R_1 C - \tfrac14\right)^{-1/2}\right] & (\alpha R_1 C \ge \tfrac14) \\[1ex] 1 & (\alpha R_1 C \le \tfrac14). \end{cases} \]

In all other cases we must find the roots of a transcendental equation by 
numerical approximation methods. 
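
The RC-filter case is simple enough to evaluate directly. The sketch below assumes the closed form y_p = tanh[(π/4)(τ1 − 1/4)^(−1/2)] for τ1 = αR1C > 1/4 and y_p = 1 otherwise, which is the reading of the formula used here; the function name is hypothetical.

```python
import math

def relative_pull_in_rc(tau1):
    """Relative pull-in y_p for the RC filter (R2 = 0).

    tau1 = alpha * R1 * C is the normalized time constant.
    Assumes y_p = tanh[(pi/4) / sqrt(tau1 - 1/4)] above tau1 = 1/4
    and y_p = 1 at or below it.
    """
    if tau1 <= 0.25:
        return 1.0
    return math.tanh(math.pi / (4.0 * math.sqrt(tau1 - 0.25)))

if __name__ == "__main__":
    for tau1 in (0.1, 0.5, 1.0, 10.0, 100.0):
        print(tau1, relative_pull_in_rc(tau1))
```

For large τ1 the hyperbolic tangent is close to its argument, so under this formula the relative pull-in falls off roughly as π/(4√τ1).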

Byrne (Ref. 4) gives graphs of the results of Section V for the lag filter. These
are graphs of relative pull-in (Fig. 13), noise bandwidth (small signal)
(Fig. 7), figure of merit (relative pull-in/noise bandwidth) (Fig. 15),
and maximum loop gain (small signal) (Fig. 8).

The noise bandwidth is a measure of the ability of the loop to reject 
small phase noise. More explicitly, the noise bandwidth N of a network 
is defined to be the bandwidth of that ideal low-pass filter which passes 
the same white noise power as the given network. 

There are many possible ways of defining a single measure of the
performance of the system, depending on the particular application in
mind. We have chosen the figure of merit $y_p/N$; i.e., a large figure of merit
implies high noise rejection and large relative pull-in.








[Figure 5: curves of relative mistuning against the period T of the comparator output.]

Fig. 5 — Relative mistuning $\omega_m/\omega_L$ in a non-synchronized steady state vs the
period $T$ of the comparator output. (a) no filter, (a) and (b) overdamped loop and
(c) underdamped loop.
For small phase deviations of the input, the comparator can be considered
linear. We can then discuss the gain of the loop as a function of
the frequency of the phase deviation. The maximum of the loop gain is
denoted by Y. In some applications Y is restricted by stability considerations
to be less than unity.

Section VI is devoted to the derivation of several interesting asymp- 
totic results for the lag filter. A simple formula is obtained for the 
relative pull-in for large values of the filter time constants. It is also 
shown that if the maximum loop gain is allowed to have a fixed value 
greater than unity, then, by appropriate choice of the time constants, 
arbitrarily large values of the figure of merit can be obtained. 

This work could not have been completed without the aid of M. 
Karnaugh who suggested the problem, E. G. Kimme who proved that 
the sawtooth comparator is a continuous approximation to the original 
discrete sample data system, C. J. Byrne whose experimental work con- 
firmed the formulae derived here, D. I. Rowlinson who constructed the 
contour curves from the computer data, and R. D. Barnard with whom 
many fruitful discussions were held. 


II. THE BASIC LOOP EQUATION 


We obtain an integro-differential equation for the loop by noting that
the output of the filter can be written as a convolution plus initial
conditions

\[ v(t) = \int_0^t f(\varphi_e(t'))\,h(t - t')\,dt' + v_0(t) \]

where $v_0(t)$ is the filter output due to residual charges and fluxes in the
filter at time zero. $v_0(t)$ damps out exponentially in all filters of interest.
Substituting this into (1) and replacing $\varphi_o$ by $\varphi_i - \varphi_e$ we obtain

\[ \frac{d\varphi_e}{dt} = -\alpha \int_0^t f(\varphi_e(t'))\,h(t - t')\,dt' + \frac{d\varphi_i}{dt} - \alpha v_0(t). \tag{2} \]

In order that the derivations which follow not be unduly complicated
by inessential parameters, we make the following normalizations

\[ x(t) = \varphi_e(t)/f_{\max} \qquad (f_{\max} = d/2) \]
\[ C(x(t)) = f(\varphi_e(t))/f_{\max} \]
\[ \varphi(t) = \varphi_i(t)/f_{\max}. \]

The normalized form of (2) becomes

\[ \frac{dx}{dt} = -\alpha \int_0^t C(x(t'))\,h(t - t')\,dt' + \frac{d\varphi}{dt} - \alpha v_0(t)/f_{\max}. \tag{3} \]
III. DEFINITIONS OF LOCK RANGE AND RELATIVE PULL-IN


If the input frequency $\omega_c + d\varphi_i/dt$ is increased "very slowly" to a
value which is not too large, the output frequency $\omega_c + d\varphi_o/dt$ will
follow it (i.e., be always equal to, or locked to, the input frequency).
The maximum value of $d\varphi_i/dt$ for which lock-in will occur is called the
lock range and is denoted by $\omega_L$. More precisely, $\omega_L$ will be determined
from (1) when the maximum dc voltage $v$ is obtained. This maximum
value is clearly the product of $\alpha$, $f_{\max}$ (the maximum value of the
comparator function $f$) and $H(0)$ (the dc gain of the filter):*

\[ \omega_L = \alpha f_{\max} H(0). \tag{4} \]

Suppose that the input frequency is not increased slowly, but in some
sudden or erratic manner. Suppose moreover that the input frequency
approaches a limiting value, $\omega_m$, the mistuning; i.e.,

\[ \lim_{t \to \infty} \frac{d\varphi_i}{dt} = \omega_m. \]

In general, even if $0 < \omega_m < \omega_L$ (that is, we are in the lock range), the
output frequency will not asymptotically lock to the input frequency
(that is, be captured), but will be a modulated frequency. We define
the relative pull-in range $y_p$ to be that normalized maximum frequency
difference such that

\[ -y_p\omega_L < \lim_{t \to \infty} \frac{d\varphi_i}{dt} = \omega_m < y_p\omega_L \tag{5} \]

implies

\[ \lim_{t \to \infty} \frac{d\varphi_o}{dt} = \omega_m. \tag{6} \]

Notice that we make no restriction on how $d\varphi_i/dt$ approaches $\omega_m$, as
long as $|\omega_m| < y_p\omega_L$.


IV. DERIVATION OF RESULTS†


4.1 Basic Equation

Let

\[ 0 \le t_0 < t_1 < \cdots < t_n < \cdots \]


* We shall use capital letters to denote the Laplace transform of the function 
denoted by corresponding lower-case letters. 
† From here on we are dealing with the sawtooth comparator.
be all the instants (called discontinuity points) at which the phase
difference $x(t)$ crosses the discontinuity of $C$, i.e.,

\[ \lim_{\Delta \to 0+} x(t_n - \Delta) = x(t_n-) = 1 + 2n' \]

where the first equality is a definition of $x(t_n-)$ and where $n'$ is an
integer dependent on $n$. Let

$a_n = 1$ if $x$ is increasing at $t_n$
$a_n = -1$ if $x$ is decreasing at $t_n$
$a_n = 0$ if $x$ is stationary at $t_n$.

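Using the unit step function

\[ u(t) = \begin{cases} 0 & \text{for } t < 0 \\ 1 & \text{for } t \ge 0 \end{cases} \]

we can express $C(x(t))$ in the analytically useful form

\[ \tfrac12 C(x(t)) = \tfrac12 x(t) - n_0 - \sum_{j=0}^{\infty} a_j u(t - t_j) \tag{7} \]

where $n_0$ is an integer so chosen that this equation holds at $t = 0$.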
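
The step-function representation can be checked numerically for a simple case. The sketch below assumes the normalized sawtooth C(x) = x on [-1, 1) with period 2, and a phase error that grows linearly, x(t) = x0 + w·t with w > 0, so that every a_j equals +1. The names x0, w and C_via_steps are illustrative only.

```python
import math

def C(x):
    """Normalized sawtooth: C(x) = x on [-1, 1), period 2."""
    return (x + 1.0) % 2.0 - 1.0

def u(t):
    """Unit step with u(0) = 1."""
    return 1.0 if t >= 0.0 else 0.0

def C_via_steps(t, x0, w, n_terms=200):
    """Evaluate C(x(t)) as x(t) - 2*n0 - 2*sum_j a_j*u(t - t_j).

    Assumes x(t) = x0 + w*t with w > 0, so each crossing is upward
    (a_j = +1) and occurs where x reaches 1 + 2*(n0 + j).
    """
    n0 = math.floor((x0 + 1.0) / 2.0)  # makes the identity hold at t = 0
    total = (x0 + w * t) - 2.0 * n0
    for j in range(n_terms):
        t_j = (1.0 + 2.0 * (n0 + j) - x0) / w  # j-th discontinuity point
        total -= 2.0 * u(t - t_j)
    return total

if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0, 2.0, 3.0):
        print(t, C(0.3 + 1.7 * t), C_via_steps(t, 0.3, 1.7))
```

Each upward crossing subtracts 2 from the running phase error, which is what lets the nonlinear comparator be handled as a linear term plus a train of steps.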
We note here for future reference that 


t(tn»n—) =m &+E+ Di a; , (8) 
7= 
Substituting (7) into the loop equation (3), we obtain


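\[ \frac12\frac{dx}{dt} = -\frac{\alpha}{2}\int_0^t x(t')\,h(t - t')\,dt' + \alpha n_0 \int_0^t h(t')\,dt' + \alpha \sum_{j=0}^{\infty} a_j \int_0^t u(t' - t_j)\,h(t - t')\,dt' + \frac12\frac{d\varphi}{dt} - \alpha v_0(t)/2f_{\max}. \tag{9} \]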
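Solving this by Laplace transform methods we obtain

\[ \tfrac12 X(s) = \frac{s\Phi(s) - [\varphi_o(0) + \alpha V_0(s)]/f_{\max}}{2(s + \alpha H(s))} + \frac{n_0}{s}\,\frac{\alpha H(s)}{s + \alpha H(s)} + \sum_{j=0}^{\infty} \frac{a_j e^{-s t_j}}{s}\,\frac{\alpha H(s)}{s + \alpha H(s)}. \]

Letting

\[ R(s) = \frac{1}{s + \alpha H(s)} \tag{10} \]

we have

\[ \frac{1}{s}\,\frac{\alpha H(s)}{s + \alpha H(s)} = \frac{1}{s} - R(s). \tag{11} \]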


Note that $sR(s)$ is the transfer function between input phase $\varphi_i$ and
comparator output phase $\varphi_e$ for the linear comparator case. $r(t)$ is then
the phase response at the linear comparator output due to a step in
input phase. Since applications will require the system to synchronize
to a step in phase, we will assume that $r(t) \to 0$ as $t \to \infty$.

Using this equation and taking inverse transforms in the equation for
$X(s)$ we obtain

\[ \tfrac12 x(t) = \tfrac12 x_1(t) + n_0(1 - r(t)) + \sum_{j} a_j u(t - t_j) - \sum_{j} a_j r(t - t_j) \tag{12} \]

where

\[ X_1(s) = \frac{s\Phi(s) - [\varphi_o(0) + \alpha V_0(s)]/f_{\max}}{s + \alpha H(s)}; \]

$x_1(t)$ is the solution of the loop equation in the case of a linear
comparator function $f(x) = x$.
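Using the final value theorem” we have

\[ x_1(\infty) = \lim_{t \to \infty} x_1(t) = \lim_{s \to 0} sX_1(s) = \lim_{t \to \infty} \frac{\varphi'(t)}{\alpha H(0)} = \frac{2\omega_m}{d\,\alpha H(0)} = \frac{\omega_m}{\omega_L}. \tag{13} \]

From (12) and (7) we have for the comparator output

\[ \tfrac12 C(x(t)) = \tfrac12 x_1(t) - \sum_{j=0}^{\infty} a_j r(t - t_j) - n_0 r(t). \tag{14} \]

In a steady-state condition this reduces to

\[ C(x(t)) = \frac{\omega_m}{\omega_L} - 2\sum_{j} a_j r(t - t_j) \tag{15} \]

where the $n_0 r(t)$ term vanishes because of the remarks following the
definition of $R(s)$.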
4.2 Steady-State Solution When Not Captured


When we are not locked and in steady state, the output of the phase
comparator will be a periodic function.* We give here a simplified
heuristic derivation of the steady-state periodic solution. A rigorous
derivation is easily obtained using the heuristics as a guide. In steady
state, the normalized comparator output $y(t) = C(x(t))$ will be periodic
with a period which we will call $T$. In a given period there may be many
discontinuity points $t_j$; let us suppose there are $k$. Then assuming we
are in steady state, we can write

\[ t_{nk+i} = nT + T_i + \tau, \qquad n = \ldots, -1, 0, 1, \ldots; \quad i = 0, 1, \ldots, k-1 \tag{16} \]

where

\[ 0 = T_0 < T_1 < \cdots < T_{k-1} < T. \]

These relations are illustrated as follows: within the period beginning at
$nT + \tau$, the discontinuity points $t_{nk}, t_{nk+1}, \ldots, t_{nk+k-1}$ fall at
$nT + \tau,\ nT + T_1 + \tau,\ \ldots,\ nT + T_{k-1} + \tau$, and the next period begins
with $t_{(n+1)k}$ at $(n+1)T + \tau$.

The $a_n$'s will be periodic in steady state and we let

\[ a_{nk+i} = A_i, \qquad n = \ldots, -1, 0, 1, \ldots; \quad i = 0, 1, \ldots, k-1. \tag{17} \]

It is no restriction to assume a time shift so that $\tau = 0$. Then, let

\[ t = mT + u \qquad (0 < u \le T) \tag{18} \]

and combine the above three equations with (14). We obtain

\[ y(t) = C[x(t)] = C[x(mT + u)] = \frac{\omega_m}{\omega_L} - 2\sum_{n,i} a_{nk+i}\, r(mT + u - t_{nk+i}) = \frac{\omega_m}{\omega_L} - 2\sum_{i=0}^{k-1} A_i \sum_{n=-\infty}^{m} r[(m - n)T + u - T_i]. \]

(The second summation has the upper limit $m$ because $r(t) = 0$ for
$t < 0$.) Letting $j = m - n$, we obtain

\[ y(t) = \frac{\omega_m}{\omega_L} - 2\sum_{i=0}^{k-1} A_i \sum_{j=0}^{\infty} r(jT + u - T_i). \tag{19} \]


* A mathematical proof is not at hand. Indications of its truth are given in
Benes® and experimental observations confirm this.
Let us define a periodic function

\[ p(t,T) = \begin{cases} \displaystyle\sum_{j=0}^{\infty} r(t + jT) & (0 < t \le T) \\[1ex] p(t - nT, T) & (nT < t \le (n+1)T) \end{cases} \tag{20} \]

or

\[ p(t,T) = \sum_{j=-\infty}^{+\infty} r(t + jT). \]

($r(t) = 0$ for $t < 0$ makes $p(t,T)$ a well defined function.) With this
definition, the normalized steady-state comparator output, when not
locked, can be written

\[ y(t) = \frac{\omega_m}{\omega_L} - 2\sum_{i=0}^{k-1} A_i\, p(t - T_i, T). \tag{21} \]


The expression for $p(t,T)$ is familiar to those in the field of sample
data systems.* Though superficially formidable, it can be expressed in
closed form quite easily for the only important class of the filter transfer
functions $H(s)$, namely rational functions. In that case $R(s)$ is a rational
function too. Hence $r(t)$ is a linear combination of exponentials of the
form $t^m e^{\beta t}$ (real part of $\beta$ negative). Then $p(t,T)$ for $0 < t \le T$ is a
linear combination of geometric series, each of the form

\[ z(t) = \sum_{j=0}^{\infty} (t + jT)^m e^{\beta(t + jT)} = \frac{d^m}{d\beta^m} \sum_{j=0}^{\infty} e^{\beta(t + jT)} = \frac{d^m}{d\beta^m}\,\frac{e^{\beta t}}{1 - e^{\beta T}}. \tag{22} \]

This steady-state solution consists of a constant term $2\omega_m/d\alpha H(0)$,
which is the normalized steady-state output for a linear phase comparator,
plus a linear combination (with coefficients ±1) of time translates
of the function $p(t,T)$, which is periodic of period $T$. The derivation
shows that every steady-state periodic solution of the loop equation has
the form of (21).

Equation (21) hides several pitfalls. These are:

1. We must have $|y(t)| \le 1$. Hence only certain $T$ and $T_i$ are
admissible.
2. Are the solutions represented by (21) physically realizable?
3. Are the solutions represented by (21) stable with respect to small
noise perturbations?

These three topics are grouped under the title Boundary Conditions
and will be discussed following a discussion of the pull-in range.

* It is the response of a filter $R(s)$ to an input $\sum_{j=-\infty}^{+\infty} \delta(t + jT)$.


4.3 Relative Pull-in 


From the definition of $T$, $y(T-) = \pm 1$ and by an appropriate choice
of $\tau$ in (16) (if $\omega_m > 0$) we may assume $y(T) = 1$. Then from (21)

\[ \frac{\omega_m}{\omega_L} = 1 + 2\sum_{i=0}^{k-1} A_i\, p(T - T_i, T). \tag{23} \]

Now the minimum value of $\omega_m > 0$ for which we have a non-constant
periodic steady-state stable solution is by definition $y_p\omega_L$, hence

\[ y_p = 1 + 2\min \sum_{i=0}^{k-1} A_i\, p(T - T_i, T) \tag{24} \]

where the minimum is taken over all $T$ and over all steady-state
solutions satisfying conditions 1, 2 and 3 above.


4.4 Boundary Conditions 


4.4.1 Discontinuity Point Condition 


$y(t)$, being the normalized phase comparator output, satisfies $-1 \le
y(t) \le 1$. Also $y(t'-) = \pm 1$ if and only if for some $n$ and $i$, $t' = T_i +
nT$, or $y(t)$ is stationary at $t'$ (i.e., $y'(t') = 0$ and $y$ at $t'$ is increasing if
$y(t') = -1$ or decreasing if $y(t') = 1$). These are equivalent to

\[ \sum_{i} A_i\,\bigl(p(t' - T_i, T) - p(T - T_i, T)\bigr) = 0 \]

if and only if $t' = nT + T_i$ or $y(t') - y(T)$ is stationary at $t'$. This
restriction will be called the discontinuity point condition.

To analytically determine whether this condition is satisfied, in a
general case, is clearly very difficult. For the case of the lag filter we
can solve the problem analytically but must rely on an experimental
fact. C. J. Byrne has found experimentally, in a large class of RC filters,
that there is just one discontinuity per period $T$, i.e., the $k$ in (21) is
one. We will call this the Experimental Hypothesis. Thus

\[ y(t) = \frac{\omega_m}{\omega_L} - 2p(t,T) \tag{25} \]

and

\[ y_p = 1 + 2\min_T p(T,T). \tag{26} \]


In the section on the lag filter we show that if

\[ p(T',T') = \min_T p(T,T) \tag{27} \]

then $p(t,T')$ satisfies the discontinuity point condition. Thus if $p(t,T')$
is realizable (it is; see below) and is stable under noise (we do not
know, but have some evidence; see below) then

\[ y_p = 1 + 2p(T',T') \tag{28} \]

for the lag filter.


4.4.2 Realizability Condition 


Does there exist, for each of the steady-state functions represented
in (21) satisfying the discontinuity point condition, a corresponding
input function $\varphi(t)$? That is, are the $y(t)$ in (21) physically realizable?

In Appendix A we prove realizability for any filter but not in quite
the form stated above. We do the following:

(a) A particular input $\varphi(t) = 2\omega_m t/d$ is injected.
(b) The loop is broken at the output of the phase comparator.
(c) Into the filter, at this point, is injected a voltage which asymptotically
has the form (21).
(d) One shows that the output of the phase comparator has asymptotically
the same form.
(e) In steady state the loop is closed.


4.4.3 Non-Synchronous Stability 


Are the solutions stable? By this we mean: Will a steady-state solution
be thrown into synchronism by a "small" noise? In formal terms, we
suppose that a solution $y(t)$ has a discontinuity point, say $t_0$, shifted by
noise to $t_0 + \Delta_0$. Each of the following discontinuity points $t_1, t_2, \ldots,
t_n, \ldots$ is shifted to $t_1 + \Delta_1, t_2 + \Delta_2, \ldots, t_n + \Delta_n, \ldots$. It suffices for
our purposes that the $(t_n + \Delta_n)$'s be asymptotically periodic (i.e., the
noise sends us into another periodic solution and not into synchronism).
The best we have been able to prove is that

\[ \lim_{n \to \infty} \left[\frac{d\Delta_n}{d\Delta_0}\right]_{\Delta_0 = 0} = c < \infty. \]

This has been done for the lag filter using the experimental assumption
that $k = 1$ and that $T' - \epsilon < T < T'$, for $\epsilon$ sufficiently small, where
$T'$ is given in (28). Now it would suffice for stability to show that $\Delta_n$ is
bounded for $\Delta_0$ sufficiently small, but the above does not imply this, for
all it says is that

\[ \Delta_n = c\Delta_0 + \epsilon_n\Delta_0 \]

and we do not know that $\epsilon_n$ is bounded.


V. LAG (INTEGRAL COMPENSATING) FILTER


5.1 General Results


This section gives all the explicit formulae for design procedures in
the case of the lag filter (Fig. 3). We assume the experimental hypothesis
(see Section 4.4) throughout this section.

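The transfer function of the filter is

\[ H(s) = \frac{t_2 s + 1}{t_1 s + 1} \]

where

\[ t_1 = (R_1 + R_2)C, \qquad t_2 = R_2 C. \]

Hence

\[ R(s) = \frac{1}{s + \alpha H(s)} = \frac{t_1 s + 1}{t_1 s^2 + (\alpha t_2 + 1)s + \alpha} = \frac{1}{t_1(p_1 - p_2)}\left[\frac{t_1 p_1 + 1}{s - p_1} - \frac{t_1 p_2 + 1}{s - p_2}\right] \]

where $p_1$ and $p_2$ are the roots of the denominator of $R(s)$. In particular,
introducing the normalized dimensionless time constants

\[ \tau_i = \alpha t_i, \qquad i = 1, 2 \]

we have for the roots

\[ p_i = -(a + (-1)^i b)/t_1 \]

where

\[ a = (\tau_2 + 1)/2 \ge \tfrac12, \qquad b^2 = a^2 - \tau_1.^* \]

* The real or imaginary part of $b$ is non-negative.

The denominator of $R(s)$ can be written (after division by $t_1$) in the form

\[ s^2 + 2\omega_n\xi s + \omega_n^2 \]

where

\[ \omega_n = (\alpha/t_1)^{1/2} \]

and $\xi$, the damping factor, is

\[ \xi = (\tau_2 + 1)/2\sqrt{\tau_1} = a/(a^2 - b^2)^{1/2}. \]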

In this notation we obtain

\[ r(t) = \frac{1}{2b}\bigl[-(a - b - 1)\exp(-(a - b)t/t_1) + (a + b - 1)\exp(-(a + b)t/t_1)\bigr]. \]

Because $r(t)$ is a linear combination of exponentials, we can easily sum
the infinite series for $p(t,T)$, obtaining

\[ p(t,T) = p(\eta',\eta) = \frac{1}{2b}\left[-(a - b - 1)\frac{\exp[-(a - b)\eta']}{1 - \exp[-(a - b)\eta]} + (a + b - 1)\frac{\exp[-(a + b)\eta']}{1 - \exp[-(a + b)\eta]}\right] \tag{29} \]

where $\eta' = t/t_1$ and $\eta = T/t_1$ are dimensionless time variables.
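
As a sanity check on the summation, the closed form can be compared with a truncated version of the defining series p = Σ_j r(t + jT). The sketch below assumes the lag-filter step response r(t) = (1/2b)[−(a−b−1)e^{−(a−b)t/t1} + (a+b−1)e^{−(a+b)t/t1}] with a = (τ2+1)/2 and b² = a² − τ1, works in the normalized variables η' = t/t1 and η = T/t1, and uses complex arithmetic so that an imaginary b (underdamped loop) is handled by the same code. Function names are hypothetical, and the critically damped case b = 0 is not covered.

```python
import cmath

def ab_params(tau1, tau2):
    """Return (a, b): a = (tau2 + 1)/2, b = sqrt(a^2 - tau1), b real or imaginary."""
    a = (tau2 + 1.0) / 2.0
    return a, cmath.sqrt(a * a - tau1)

def r(eta_p, tau1, tau2):
    """Normalized step response r at eta_p = t/t1 (zero for negative time)."""
    if eta_p < 0.0:
        return 0.0
    a, b = ab_params(tau1, tau2)
    val = (-(a - b - 1.0) * cmath.exp(-(a - b) * eta_p)
           + (a + b - 1.0) * cmath.exp(-(a + b) * eta_p)) / (2.0 * b)
    return val.real  # the imaginary parts cancel for real or imaginary b

def p_closed(eta_p, eta, tau1, tau2):
    """Closed-form sum of the series, valid for 0 < eta_p <= eta."""
    a, b = ab_params(tau1, tau2)
    t1 = -(a - b - 1.0) * cmath.exp(-(a - b) * eta_p) / (1.0 - cmath.exp(-(a - b) * eta))
    t2 = (a + b - 1.0) * cmath.exp(-(a + b) * eta_p) / (1.0 - cmath.exp(-(a + b) * eta))
    return ((t1 + t2) / (2.0 * b)).real

def p_series(eta_p, eta, tau1, tau2, terms=2000):
    """Truncated series sum_j r(eta_p + j*eta), for comparison only."""
    return sum(r(eta_p + j * eta, tau1, tau2) for j in range(terms))

if __name__ == "__main__":
    print(p_closed(0.4, 1.5, 2.0, 0.5), p_series(0.4, 1.5, 2.0, 0.5))
```

The two evaluations should agree to rounding error whenever the exponentials decay, i.e. whenever a − Re(b) > 0.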
To obtain $y_p$ using the results of (27) we must find

\[ \min_T p(T,T) \]

or the roots of

\[ 0 = \frac{d\,p(\eta,\eta)}{d\eta}. \]

Differentiating the expression for $p(\eta,\eta)$ we obtain $\eta \ne 0$ and

\[ \frac{\sinh^2 (a - b)\eta/2}{\sinh^2 (a + b)\eta/2} = \frac{(a - b)(a - b - 1)}{(a + b)(a + b - 1)} \tag{30} \]

or $\eta = \infty$. And upon using the addition formula for the hyperbolic sine,
we have

\[ \frac{\tanh a\eta/2}{b^{-1}\tanh b\eta/2} + b^2\,\frac{b^{-1}\tanh b\eta/2}{\tanh a\eta/2} = 2\,\frac{a^2 + b^2 - a}{2a - 1} = 2c \tag{31} \]

which defines $c$, or $\eta = \infty$.


618 THE BELL SYSTEM TECHNICAL JOURNAL, MARCH 1962 


Use of the quadratic formula gives

\[ \frac{\tanh a\eta/2}{b^{-1}\tanh b\eta/2} = c + (c^2 - b^2)^{1/2} = c_1(a,b) \tag{32} \]

or $\eta = \infty$.* In special cases considered it was found that the minimum
of $p(\eta,\eta)$ occurs at the first positive zero of its derivative (or at $\eta = \infty$).

5.2 Critical Damping 


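From (31) we see that as $b$ approaches zero (damping factor equals
one),

\[ \frac{\tanh a\eta/2}{\eta/2} = \frac{2a(a - 1)}{2a - 1} = c_1(a,0), \quad \text{if } a > 1; \qquad \eta = \infty, \quad \text{if } \tfrac12 \le a \le 1. \tag{33} \]

Thus $y_p = 1$ for $b = 0$ and $\tfrac12 \le a \le 1$.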


5.3 No Filter and RC Filter 
The filter parameters satisfy

\[ 0 \le \tau_2 \le \tau_1 \]

which upon conversion to the $a$ and $b$ parameters become

\[ (a - 1)^2 - b^2 \ge 0 \]

and

\[ a \ge \tfrac12. \]

Equality holds in the first case when $R_1 = 0$ or $C = 0$ (i.e., there is no
filter) and in the second case when $R_2 = 0$ (i.e., a simple RC filter).
For no filter, $a + b - 1 = 0$ or $a - b - 1 = 0$, and referring to (30)
we have only $\eta = \infty$. Thus $\min p(\eta,\eta) = 0$ and $y_p = 1$.
For the RC filter $R_2 = 0$, $a = \tfrac12$, we obtain from (30) $\eta \ne 0$ and
$\sinh b\eta = 0$ or $\eta = \infty$. If $b$ is real, $\eta = \infty$ and $y_p = 1$. If $b$ is imaginary

\[ \eta = m\pi/|b|, \qquad m = 1, 2, \ldots \]

and we easily find that $p(m\pi/|b|, m\pi/|b|)$ is minimum at $m = 1$, giving
finally

\[ y_p = \begin{cases} \tanh\left[\dfrac{\pi}{4}\left(\tau_1 - \tfrac14\right)^{-1/2}\right] & \text{if } \tau_1 \ge \tfrac14 \\[1ex] 1 & \text{if } \tau_1 \le \tfrac14. \end{cases} \tag{34} \]

* If the negative sign were used in the quadratic formula then $\eta$ would be
negative (complex) when $b$ was imaginary (real).

The results of these special cases are graphically summarized in Fig.
6. (Also see Fig. 18 of Byrne, Ref. 4.) In the shaded area of Fig. 6 the
relative pull-in is unity. This follows from the fact that the left-hand
side of (31) is bounded below by $2b$* while

\[ 2c - 2b = 2(a - b - 1)(a - b)/(2a - 1) \]

is negative in that region. Hence in (31) we must have $\eta = \infty$.

[Figure 6: two panels showing the admissible region, with areas labeled b real, b imaginary (underdamped loop), and the line $\tau_1 = \tau_2$ (no filter).]

Fig. 6 — In part (a) the parameters $a$ and $b$ are restricted to lie below and/or
to the right of the polygonal curve. The heavy lines and the shaded area give
values of $a$ and $b$ for which the relative pull-in is unity. In part (b) the same
information is given for the normalized time constants $\tau_1$ and $\tau_2$.


5.4 Computational Procedures 


Except in the special cases of no filter ($R_1 = 0$) and the RC filter
($R_2 = 0$), there is no simple way of computing the relative pull-in. We
must solve (32) by an iterative procedure and substitute the result into
the equation for $p(\eta,\eta)$. If $\eta$ is the solution of (32) or (33) we have a
simpler equation for $y_p$, namely

\[ y_p = [1 - D\,\mathrm{sech}^2\, a\eta/2]/\tanh a\eta/2 \]

where, with $c_1 = c + (c^2 - b^2)^{1/2}$ the right-hand side of (32),

\[ D = \frac{(a - 1)c_1 - b^2}{c_1^2 - b^2} \qquad (b \ne 0) \]

and

\[ D = (a - \tfrac12)/a \qquad (b = 0). \]
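
A minimal numerical sketch of this procedure follows. It is not the computation used for the published contours; it assumes the closed form of p(η,η) for the lag filter, locates the first positive η at which dp/dη changes sign from negative to positive by a coarse scan followed by bisection, and returns y_p = 1 + 2p(η,η). The simplified expression with D = ((a−1)c1 − b²)/(c1² − b²), c1 = c + √(c² − b²), c = (a² + b² − a)/(2a − 1) is included as a cross-check for b ≠ 0 and a ≠ 1/2. The scan range, step count and function names are arbitrary choices made for this illustration, and the critically damped case b = 0 is not handled.

```python
import cmath
import math

def _ab(tau1, tau2):
    a = (tau2 + 1.0) / 2.0
    return a, cmath.sqrt(a * a - tau1)  # b real (overdamped) or imaginary

def y_of_eta(eta, tau1, tau2):
    """1 + 2*p(eta, eta) from the closed form for the lag filter."""
    a, b = _ab(tau1, tau2)
    t1 = -(a - b - 1.0) / (cmath.exp((a - b) * eta) - 1.0)
    t2 = (a + b - 1.0) / (cmath.exp((a + b) * eta) - 1.0)
    return 1.0 + ((t1 + t2) / b).real

def dp_deta(eta, tau1, tau2):
    """Derivative of p(eta, eta) with respect to eta."""
    a, b = _ab(tau1, tau2)
    q = (a - b) * (a - b - 1.0)
    p = (a + b) * (a + b - 1.0)
    s_minus = cmath.sinh((a - b) * eta / 2.0)
    s_plus = cmath.sinh((a + b) * eta / 2.0)
    return ((q / s_minus ** 2 - p / s_plus ** 2) / (8.0 * b)).real

def relative_pull_in(tau1, tau2, steps=6000):
    """Return (y_p, eta) at the first minimum of p(eta, eta), or (1, inf)."""
    a, b = _ab(tau1, tau2)
    eta_max = 60.0 / (a + abs(b))  # assumed scan range; keeps exponents modest
    h = eta_max / steps
    prev = h
    for k in range(2, steps + 1):
        cur = k * h
        if dp_deta(prev, tau1, tau2) < 0.0 <= dp_deta(cur, tau1, tau2):
            lo, hi = prev, cur
            for _ in range(80):  # bisection on the sign of dp/deta
                mid = 0.5 * (lo + hi)
                if dp_deta(mid, tau1, tau2) < 0.0:
                    lo = mid
                else:
                    hi = mid
            eta = 0.5 * (lo + hi)
            return y_of_eta(eta, tau1, tau2), eta
        prev = cur
    return 1.0, math.inf  # no finite minimum found in the scanned range

def y_simplified(eta, tau1, tau2):
    """[1 - D sech^2(a*eta/2)] / tanh(a*eta/2), for b != 0 and a != 1/2."""
    a, b = _ab(tau1, tau2)
    c = (a * a + b * b - a) / (2.0 * a - 1.0)
    c1 = c + cmath.sqrt(c * c - b * b)
    D = ((a - 1.0) * c1 - b * b) / (c1 * c1 - b * b)
    th = math.tanh(a * eta / 2.0)
    return ((1.0 - D * (1.0 - th * th)) / th).real

if __name__ == "__main__":
    print(relative_pull_in(1.0, 0.0))  # RC filter
    print(relative_pull_in(4.0, 1.0))  # underdamped lag filter
```

For the RC filter the scan should reproduce the closed-form result tanh[(π/4)(τ1 − 1/4)^(−1/2)]. Close to critical damping the first minimum can fall outside the assumed scan range, in which case the routine reports y_p = 1 and the range would need to be enlarged.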


An upper bound for $\eta$ is obtained from (32) and (33). Using the fact
that $\tanh x < 1$, we obtain

\[ \eta < \begin{cases} 2(\tanh^{-1} b/c_1)/b & (b \ne 0) \\ 2/c_1 & (b = 0). \end{cases} \tag{35} \]

A lower bound for $\eta$ in the case $b$ is real is obtained by using the
inequalities

\[ z - z^3/3 \le \tanh z \le z. \]

Using this in the equation for $\eta$ we have

\[ a\eta/2 - (a\eta/2)^3/3 \le \tanh a\eta/2 = \begin{cases} c_1\,(\tanh b\eta/2)/b \le c_1\eta/2 & (b \ne 0) \\ c_1(a,0)\,\eta/2 & (b = 0) \end{cases} \]

giving the lower bound

\[ \eta^2 \ge 12(a - c_1)/a^3. \]

* The left-hand side of (31) is of the form $b(x + 1/x)$. For $x$ positive this is
bounded below by $2b$.

We note here for future reference that if $b$ is imaginary, $b = ib'$, then
(35) implies the inequality

\[ 0 < \eta b' < \pi. \tag{36} \]


5.5 Discontinuity Point Condition for the Lag Filter 
To prove that this condition is satisfied, it suffices to show that

\[ y'(t) \ne 0 \quad \text{for } 0 < t < T. \tag{37} \]

For, since we may suppose

\[ y(T) = 1, \]

it follows that if

\[ y(t') = 1, \qquad 0 < t' < T, \]

then Rolle's theorem tells us that there exists a $t''$ with $t' < t'' < T$
such that $y'(t'') = 0$. This contradicts (37). It suffices also to prove
(37) for that $T$ which minimizes $p(T,T)$.

Recall that we are assuming we have a lag filter and that k = 1 in 
(21) (experimental hypothesis). Assuming (37) false, we obtain from 
(29) after some calculation 


erate’ Catal? ag 
e— (ab) 1/2 = 1 — e—(a-b)n/2 (38 ) 
where 7 minimizes j(u,u). Note that 0 < 7’ < 7. 
Case 1, b real. Then a > b and. 
eer 


e— (ab) 9/2 s 


= e (a+b) al? 


= e—(a—b)n/2 


L— eo atne x 


> {= em (a—b)n/2° 


Hence (38) is false.


*TfO<a<y <1, then z/y > x — 1/y — 1, for —x > —y implies zy — + > 
xy — y; hence in factoring and dividing we obtain the desired inequality. 
Case 2, b imaginary. Let b = ib', then (38) becomes
—b'n!/2 + mm = arg(e Or? — 1), (39) 


Now the real part of eT"? — 1 is negative and the imaginary part 
g 


is negative (since by (36), 0 < b’n/2 < 2/2). Hence the right-hand 
side is an angle in the third quadrant. But the left-hand side is an angle 
which can only be in the second or fourth quadrant, since 

0 < b'n! < b'n < =. 


Hence (39) is false, proving the discontinuity point condition. 


5.6 Small-Signal Properties of the Loop 


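In this section we give formulae for design parameters of the loop
when we are operating on the linear portion of the phase comparator.
Then the closed loop transfer function $Y$ is

\[ Y(s) = \frac{\alpha(t_2 s + 1)}{t_1 s^2 + (\alpha t_2 + 1)s + \alpha}. \]

Restricting our attention to real frequencies and normalizing the
frequency $\omega$ by

\[ \Omega = \omega/\alpha \]

and recalling that

\[ \tau_1 = \alpha t_1, \qquad \tau_2 = \alpha t_2 \]

we obtain

\[ |Y(\Omega)|^2 = \frac{\tau_2^2\Omega^2 + 1}{(1 - \tau_1\Omega^2)^2 + (\tau_2 + 1)^2\Omega^2} \]

with the phase shift

\[ \theta = -\arctan\frac{(1 + \tau_1\tau_2\Omega^2)\Omega}{(1 - \tau_1\Omega^2) + \tau_2(1 + \tau_2)\Omega^2} - \begin{cases} 0 & \text{if denominator positive}^* \\ \pi & \text{if denominator negative.} \end{cases} \]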


Important parameters for design are the maximum gain and the
frequency and phase shift at which it occurs and the range of frequencies
for which the gain exceeds one. Differentiating $|Y(\Omega)|^2$ and solving for
its zero gives


* The arctan is an angle in the first or fourth quadrant. 
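\[ \Omega_{\max}^2 = \begin{cases} (2\tau_1 - 1)/2\tau_1^2 & \text{if } \tau_2 = 0,\ \tau_1 \ge \tfrac12 \\ \{[1 + (2(\tau_1 - \tau_2) - 1)\tau_2^2/\tau_1^2]^{1/2} - 1\}/\tau_2^2 & \text{if } \tau_1 - \tau_2 \ge \tfrac12 \\ 0 & \text{if } \tau_1 - \tau_2 \le \tfrac12. \end{cases} \]

Solving $|Y(\Omega)|^2 \ge 1$ gives

\[ \Omega^2 \le \Omega_1^2 \]

where

\[ \Omega_1^2 = \begin{cases} \dfrac{2(\tau_1 - \tau_2) - 1}{\tau_1^2} & \text{if } \tau_1 - \tau_2 \ge \tfrac12 \\[1ex] 0 & \text{if } \tau_1 - \tau_2 \le \tfrac12. \end{cases} \]

We also have the interesting inequality

\[ \sqrt{2}\,\Omega_{\max} \le \Omega_1 \]

with equality when $\tau_2 = 0$. The cases $\tau_2 = 0$ and $\tau_1 - \tau_2 \le \tfrac12$ are
immediate. The case $\tau_1 - \tau_2 \ge \tfrac12$ gives

\[ \Omega_{\max}^2 = \{[1 + \tau_2^2\Omega_1^2]^{1/2} - 1\}/\tau_2^2 = \frac{\Omega_1^2}{[1 + \tau_2^2\Omega_1^2]^{1/2} + 1} \le \frac{\Omega_1^2}{2} \]

proving the result in this case.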
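
The small-signal quantities above are easy to evaluate numerically. The sketch below assumes the gain expression |Y(Ω)|² = (τ2²Ω² + 1)/[(1 − τ1Ω²)² + (τ2 + 1)²Ω²] and the expressions for Ω²_max and Ω1² as read here; function names are hypothetical.

```python
import math

def gain_sq(omega_sq, tau1, tau2):
    """|Y|^2 as a function of Omega^2 (normalized frequency squared)."""
    w = omega_sq
    return (tau2 * tau2 * w + 1.0) / ((1.0 - tau1 * w) ** 2 + (tau2 + 1.0) ** 2 * w)

def omega_max_sq(tau1, tau2):
    """Omega^2 at which the small-signal gain peaks (0 if the peak is at dc)."""
    if tau1 - tau2 <= 0.5:
        return 0.0
    if tau2 == 0.0:
        return (2.0 * tau1 - 1.0) / (2.0 * tau1 * tau1)
    k = (2.0 * (tau1 - tau2) - 1.0) * tau2 * tau2 / (tau1 * tau1)
    return (math.sqrt(1.0 + k) - 1.0) / (tau2 * tau2)

def omega_one_sq(tau1, tau2):
    """Omega^2 below which the gain is at least unity."""
    return max(0.0, (2.0 * (tau1 - tau2) - 1.0) / (tau1 * tau1))

if __name__ == "__main__":
    tau1, tau2 = 3.0, 1.0
    w_max = omega_max_sq(tau1, tau2)
    print("peak gain squared:", gain_sq(w_max, tau1, tau2))
    print("2*Omega_max^2 <= Omega_1^2:", 2.0 * w_max <= omega_one_sq(tau1, tau2))
```

For τ1 = 3, τ2 = 1 the gain is exactly unity at Ω1² = 1/3 and peaks at a lower frequency, consistent with the inequality √2 Ω_max ≤ Ω1.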
We wish to emphasize that the maximum gain is unity if and only if
$\tau_1 - \tau_2 \le \tfrac12$. Peak gain = constant contours are given in Fig. 8 of Ref. 4.
The 3 db point occurs at $\Omega = \Omega_2$ where

\[ |Y(\Omega_2)|^2 = \tfrac12 \]

from which we obtain

\[ \Omega_2^2 = \beta + (\beta^2 + \tau_1^{-2})^{1/2} \]

where

\[ \beta = (\tau_2^2 + 2(\tau_1 - \tau_2) - 1)/2\tau_1^2. \]
The noise bandwidth $N$ is defined by’

\[ N = \int_0^\infty |Y(\omega)|^2\,d\omega. \]

It can be evaluated in various ways; for example, see Ref. 10. One obtains

\[ N = \pi\alpha(1 + \tau_2^2/\tau_1)/2(\tau_2 + 1). \]

In the no-filter case ($\tau_2 = \tau_1 = 0$) and RC case ($\tau_2 = R_2 = 0$) we have
$N = \pi\alpha/2$. $N$ = constant contours are given in Ref. 4, Fig. 7.
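
The closed form for the noise bandwidth can be checked by direct numerical integration. The sketch below works with the normalized quantity N/α = ∫₀^∞ |Y(Ω)|² dΩ, assumes the gain expression |Y(Ω)|² = (τ2²Ω² + 1)/[(1 − τ1Ω²)² + (τ2 + 1)²Ω²], and maps the infinite range onto (0, π/2) with the substitution Ω = tan θ. The midpoint rule and the number of panels are arbitrary choices for the illustration, and τ1 is assumed positive.

```python
import math

def gain_sq(omega, tau1, tau2):
    """|Y(Omega)|^2 for the lag filter, Omega = omega/alpha."""
    w = omega * omega
    return (tau2 * tau2 * w + 1.0) / ((1.0 - tau1 * w) ** 2 + (tau2 + 1.0) ** 2 * w)

def noise_bandwidth_numeric(tau1, tau2, panels=20000):
    """N/alpha by the midpoint rule after the substitution Omega = tan(theta)."""
    h = (math.pi / 2.0) / panels
    total = 0.0
    for k in range(panels):
        theta = (k + 0.5) * h
        omega = math.tan(theta)
        total += gain_sq(omega, tau1, tau2) * (1.0 + omega * omega)  # dOmega = sec^2 dtheta
    return total * h

def noise_bandwidth_closed(tau1, tau2):
    """N/alpha = pi*(1 + tau2^2/tau1) / (2*(tau2 + 1)), for tau1 > 0."""
    return math.pi * (1.0 + tau2 * tau2 / tau1) / (2.0 * (tau2 + 1.0))

if __name__ == "__main__":
    for tau1, tau2 in ((3.0, 1.0), (2.0, 0.0), (10.0, 4.0)):
        print(tau1, tau2, noise_bandwidth_numeric(tau1, tau2), noise_bandwidth_closed(tau1, tau2))
```

For the RC filter (τ2 = 0) both evaluations give π/2 regardless of τ1, which matches the statement that N = πα/2 in that case.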

As discussed in the introduction, the figure of merit was chosen to be
the ratio $y_p/N$. $y_p/N$ = constant contours are given in Ref. 4, Fig. 15.


VI. ASYMPTOTIC RESULTS 


In this section we obtain the asymptotic results stated in the introduc- 
tion. Since the derivations are tedious, the results are first summarized. 

From computer data, the contour curves of relative pull-in $y_p$ =
constant with ordinate and abscissa the normalized time constants

\[ \tau_1 = \alpha(R_1 + R_2)C \]
\[ \tau_2 = \alpha R_2 C \]

seem to be asymptotic to straight lines for large values of the normalized
parameters. (See Fig. 13 in Ref. 4.) This observation led to the conjecture
that for fixed $y_p$ and large $\tau_2$

\[ \tau_1 = K(\tau_2 + 1). \]

In Appendix C we prove this and show that

\[ 1/K = 1 - ((1/y_p) - y_p)(\tanh^{-1} y_p). \]


With respect to the figure of merit (see Fig. 15 in Ref. 4), the following
very important results are derived in Appendix B for the lag filter. Suppose
the peak small-signal phase gain $Y$ of the loop is restricted to be
unity (it is always unity at dc). Then the maximum merit obtainable
for filters giving the unity peak loop gain is 2.27. If, however, we permit
a fixed peak gain greater than unity, we can have an arbitrarily large
merit figure. This usually results in very poor transient response. More
precisely, the following results are derived in Appendix B. Let us consider
those lag filters for which the peak small-signal (phase) gain is
fixed at $Y$. Define $M$ by

\[ M = 1 - Y^{-2}. \]

Then for a filter with normalized time constants $\tau_1$ and $\tau_2$ and normalized
frequency $\Omega = \omega/\alpha$, for which the loop has peak gain $Y$ occurring at
frequency $\Omega_{\max}$, we have

\[ \Omega_{\max}^2 = M^{1/2}/\tau_1 \]

and

\[ \tau_1 = (M\tau_2^2 + 2\tau_2 + 1)/2(1 - M^{1/2}). \]


Asymptotically for rz, large we obtain for the noise bandwidth (with 
a = (tr + 1)/2) 


N/ra = (1 aa 2 10) / 40 + 0(a™*)* 


and for the relative pull-in range 


ee [ — 
Yn = a 4 —— VM — 1 
V3 M 


: (" + ‘) + O((72/71)#). 


+ O(a) 


V3 
Thus the noise bandwidth decreases as $a^{-1}$ while the relative pull-in
decreases as $a^{-1/2}$. Hence the figure of merit increases as $a^{1/2}$.
The derivations of the preceding results are given in Appendices B 
and C. 







APPENDIX A 


Realizability of Steady-State Solutions 
Recall that (assuming d = 2) 


oo 


y(t) = HO) —2 3 A; 2 r(t — T; — nT) (40) 





where [see (13)] 


Wm 


~ @H(0) ’ 


t= r1( 0) 





Since we assume y(t) satisfies the discontinuity point condition 


Wm 


k-1 


Break the loop at the output of the phase comparator, inject y(t) 





* Two functions f(x) and g(x) satisfy f(x) = O(g(x)) if and only if | f(x)/g(x) | ≤
constant < ∞ for x sufficiently large.
into the filter, and let the input phase be w,¢ + x, − c (where c is
defined below). The phase output of the oscillator is given by 


del) <a y(n — 0) at 
dt 0 





and upon integrating once and substituting (40), 


g(t) = i) [ [ h(t”) dt” at’ 
=o > A; > [ i ar(t” — T; — nT)h(t’ — t”) dt” dt’. 


By taking the Laplace transform of the double integral in the summa- 
tion and by using the relations in (10) and (11), we find 
t pt’ 
h(t”) dt” dt’ 
0 


po(t) = HO) I 


eo 


= 25 Add ule - T; —nT) — >> r(t — T; = nt), 


n=0 


ape 


Now the remaining double integral is the integral of the step response
of the filter and for large t is of the form H(0)t + c. Using this and the
definition of y(t), we obtain for large t


k-1 


pot) ~ wnt +e — 2 2, A: 3 Yule 7 — nT) — y(t) + 2,. 


Now using the discontinuity point condition and the representation of 
the comparator in (7) we find the comparator output is asymptotically 
y(t). Hence in steady state we may close the circuit without any dis- 
turbance. 


APPENDIX B 


Figure of Merit for Constant Peak Gain and Large Time Constants (Lag 
Filter) 


From Section 5.6 we have for the closed loop small-signal (phase)
gain

$$|Y(\Omega)|^2 = \frac{\tau_2^2 \Omega^2 + 1}{(\tau_2 + 1)^2 \Omega^2 + (1 - \Omega^2 \tau_1)^2}. \tag{41}$$




Differentiating with respect to $\Omega^2$ and equating the result to zero gives

$$\tau_1^2 \tau_2^2 \Omega_{\max}^4 + 2\tau_1^2 \Omega_{\max}^2 - [2(\tau_1 - \tau_2) - 1] = 0. \tag{42}$$


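We can also represent the square of peak gain $Y^2$ as the ratio of the
derivatives of the numerator and denominator of (41) evaluated at
$\Omega_{\max}^2$:*

$$Y^2 = \frac{\tau_2^2}{(\tau_2 + 1)^2 - 2\tau_1 (1 - \tau_1 \Omega_{\max}^2)}
      = \frac{\tau_2^2}{\tau_2^2 - [2(\tau_1 - \tau_2) - 1] + 2\tau_1^2 \Omega_{\max}^2}.$$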


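This, after using (42), gives

$$Y^2 = \frac{1}{1 - \tau_1^2 \Omega_{\max}^4}. \tag{43}$$

Defining $M \ge 0$ by

$$M^2 = 1 - Y^{-2},$$

we have

$$0 \le M < 1, \quad \text{since } 1 \le Y < \infty .$$

Also (43) gives

$$\tau_1 \Omega_{\max}^2 = M. \tag{44}$$

Substitute (44) into (42) and solve for $\tau_1$. Then

$$\tau_1 = \frac{M^2 \tau_2^2 + 2\tau_2 + 1}{2(1 - M)} .$$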
Using this result in the formula for the noise bandwidth (Section 5.6),
we have for Y constant and $\tau_2$ large (and hence $a = (\tau_2 + 1)/2$ is
large)

$$N = \frac{\pi\alpha}{4a}\left(1 + \frac{2(1 - M)}{M^2}\right) + O(a^{-2}). \tag{45}$$
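The chain from a prescribed peak gain Y to the filter time constant and the asymptotic noise bandwidth can be checked numerically. The Python sketch below assumes the readings $M^2 = 1 - Y^{-2}$, $\tau_1 = (M^2\tau_2^2 + 2\tau_2 + 1)/2(1 - M)$ and the gain expression (41); all function names are illustrative and the restriction to Y > 1 is a choice made here so that M > 0.

```python
import math


def lag_filter_for_peak_gain(peak_gain, tau2):
    """Return (M, tau1, omega_max_squared) for peak gain Y > 1 and normalized tau2."""
    if peak_gain <= 1.0:
        raise ValueError("this sketch assumes Y > 1 so that M > 0")
    m = math.sqrt(1.0 - peak_gain ** -2)
    tau1 = (m * m * tau2 * tau2 + 2.0 * tau2 + 1.0) / (2.0 * (1.0 - m))
    return m, tau1, m / tau1


def gain_squared(omega_sq, tau1, tau2):
    """Squared gain of (41) as a function of the squared normalized frequency."""
    numerator = tau2 * tau2 * omega_sq + 1.0
    denominator = (tau2 + 1.0) ** 2 * omega_sq + (1.0 - tau1 * omega_sq) ** 2
    return numerator / denominator


def noise_bandwidth_exact(tau1, tau2):
    """N/(pi*alpha) from the closed form of Section V."""
    return (1.0 + tau2 * tau2 / tau1) / (2.0 * (tau2 + 1.0))


def noise_bandwidth_asymptotic(m, tau2):
    """Leading term of N/(pi*alpha) for large a = (tau2 + 1)/2, as in (45)."""
    a = (tau2 + 1.0) / 2.0
    return (1.0 + 2.0 * (1.0 - m) / (m * m)) / (4.0 * a)


if __name__ == "__main__":
    for tau2 in (5.0, 50.0, 500.0):
        m, tau1, x_max = lag_filter_for_peak_gain(2.0, tau2)
        peak = math.sqrt(gain_squared(x_max, tau1, tau2))
        print(tau2, tau1, peak,
              noise_bandwidth_exact(tau1, tau2),
              noise_bandwidth_asymptotic(m, tau2))
```

The printed peak should reproduce the requested Y, and the exact and asymptotic bandwidths should approach each other as $\tau_2$ grows.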


We now turn to the problem of obtaining asymptotic expressions for 
the relative pull-in range for Y fixed and greater than unity. 
We can rewrite the expression for $\tau_1$ as

$$\tau_1 = \frac{2M^2}{1 - M}\, a^2 + 2(1 + M)\, a - \frac{M + 1}{2}. \tag{46}$$

* If f(x) = p(x)/q(x), then f'(x_0) = 0 implies f(x_0) = p'(x_0)/q'(x_0). One obtains
this result by logarithmic differentiation of f(x).

Using the definition $b^2 = a^2 - \tau_1$, (46) and the binomial expansion,
we have for large a

$$\frac{b}{a} = \left(1 - \frac{2M^2}{1 - M}\right)^{1/2}
  - (1 + M)\left(1 - \frac{2M^2}{1 - M}\right)^{-1/2} \frac{1}{a} + O(a^{-2}) \tag{47}$$

if $M \ne \tfrac{1}{2}$ and

$$b^2 = -3\left(a - \tfrac{1}{4}\right) \tag{48}$$


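if $M = \tfrac{1}{2}$. In the following we suppose $M \ne \tfrac{1}{2}$. Recall (31) that to find
the relative pull-in we need the root of

$$\frac{\tanh (bn/2)}{\tanh (an/2)} = \frac{b}{c_1}$$

where

$$c_1 = c + (c^2 - b^2)^{1/2}$$

and

$$c = (a^2 + b^2 - a)/(2a - 1).$$

Hence

$$c = \frac{2a^2 - a - \tau_1}{2a - 1} = a\left[1 - \frac{\tau_1}{a(2a - 1)}\right] = a - \frac{\tau_1}{\tau_2} \tag{49}$$

$$\phantom{c} = a\left[1 - \frac{M^2}{1 - M} + O(a^{-1})\right].$$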
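Also

$$c^2 - b^2 = \left(a - \frac{\tau_1}{\tau_2}\right)^2 - (a^2 - \tau_1)
  = \frac{\tau_1}{\tau_2}\left(\frac{\tau_1}{\tau_2} - 1\right) \tag{50}$$

giving

$$(c^2 - b^2)^{1/2} = \frac{\tau_1}{\tau_2}\left[1 - \frac{1}{2}\,\frac{\tau_2}{\tau_1}
  - \frac{1}{8}\left(\frac{\tau_2}{\tau_1}\right)^2 + O\!\left(\left(\frac{\tau_2}{\tau_1}\right)^3\right)\right].$$

Finally

$$c_1 = \left(a - \frac{\tau_1}{\tau_2}\right) + (c^2 - b^2)^{1/2}
  = a\left[1 - \frac{1}{2a} + O(a^{-2})\right]. \tag{51}$$

Setting z = an/2, we have

$$\tanh z = \frac{c_1}{b} \tanh \frac{b z}{a}. \tag{52}$$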
We will show that for large a, z is small and then obtain an approximation
to z by using a power series expansion for tanh z. First note that the
derivative at zero of the right-hand side of (52) is

$$\frac{c_1}{a} = 1 - \frac{1}{2a} + O(a^{-2}),$$

which approaches 1 from below for large a. Also

$$\frac{c_1}{b} = \left(1 - \frac{2M^2}{1 - M}\right)^{-1/2} + O(a^{-1}).$$

Hence | $c_1/b$ | is bounded away from 1 (and greater than 1).
A sketch of the curves of the two sides of (52) with the above two
facts shows that

$$\lim_{a \to \infty} z = 0.$$

Using power series expansions in (52) we obtain for large a

$$z - \frac{z^3}{3} \approx \frac{c_1}{b}\left(\frac{b}{a}\, z - \frac{b^3}{3a^3}\, z^3\right),$$

$$z^2 \approx 3\,\frac{1 - c_1/a}{1 - c_1 b^2/a^3}
  = 3\,\frac{\dfrac{1}{2a} + O(a^{-2})}{\dfrac{2M^2}{1 - M} + O(a^{-1})} \tag{53}$$

or

$$z = \frac{(3(1 - M))^{1/2}}{2M}\, a^{-1/2} + O(a^{-3/2}).$$

From Section 5.4 the relative pull-in is


Yr = a) Ser + D tanh (=) 


an 
tanh (2) 


where . 
D = ((a — 1)y — 0) /(e" — 0’). 
Then 
ae ae (cy —-at 1) 
ar a co. — b?/ey 
and 
_a—-a@ +1 
1-D=+— (54) 
since 
a. + b’/e, = 2c. 
Using (49) and (51) we have 
1-—-M _9 


We now obtain the asymptotic formula for y, by substitution into 
the formula for y, the approximations for D, 1 — D and the approxima- 
tion tanh (an/2) ~ an/2 = z with z approximated as in (53). 


_ 201 — M)! 


Ye = aR a + O(a) 


5 (56) 
=e, aa: 1)/71)*? + O(7273). 
APPENDIX C 


Relative Pull-in (Lag Filter, Large $\tau_1$ and $\tau_2$)
Assuming that for large a

$$\tau_1 = 2Ka + L + O(a^{-1}) \tag{57}$$

we obtain from the definition

$$b^2 = a^2 - \tau_1$$

that

$$b^2 = (a - K)^2 - (L + K^2) + O(a^{-1}).$$

Expanding the square root we obtain

$$b = a - K - \frac{L + K^2}{2a} + O(a^{-2}) \tag{58}$$

and

$$\frac{b}{a} = 1 - \frac{K}{a} + O(a^{-2}). \tag{59}$$
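From (57) since

$$a = (\tau_2 + 1)/2$$

we have

$$\frac{\tau_1}{\tau_2} = K + \frac{L + K}{\tau_2} + O(\tau_2^{-2}). \tag{60}$$

From (50) we have

$$c^2 - b^2 = \frac{\tau_1}{\tau_2}\left(\frac{\tau_1}{\tau_2} - 1\right)$$

and by using (60) we have

$$c^2 - b^2 = (K^2 - K)\left[1 + \frac{(L + K)(2K - 1)}{K^2 - K}\,\frac{1}{\tau_2} + O(\tau_2^{-2})\right].$$

Using the binomial expansion

$$(c^2 - b^2)^{1/2} = (K^2 - K)^{1/2}\left[1 + \frac{1}{2}\,\frac{(L + K)(2K - 1)}{K^2 - K}\,\frac{1}{\tau_2} + O(\tau_2^{-2})\right]. \tag{61}$$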


From (49)

$$c = a - \frac{\tau_1}{\tau_2}$$

and using (60), we obtain

$$c = a - K - \frac{L + K}{\tau_2} + O(\tau_2^{-2}). \tag{62}$$


Then

$$c_1 = c + (c^2 - b^2)^{1/2} = (a - K) + (K^2 - K)^{1/2} + O(\tau_2^{-1}). \tag{63}$$

Finally from (58) and (63)

$$\frac{c_1}{b} = \frac{a - K + (K^2 - K)^{1/2} + O(a^{-1})}{a - K + O(a^{-1})}
  = 1 + \frac{(K^2 - K)^{1/2}}{a} + O(a^{-2}). \tag{64}$$
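The expansion (64) can be checked against the exact quantities. The Python sketch below assumes the definitions used in this appendix as read here: $b^2 = a^2 - \tau_1$, $c = a - \tau_1/\tau_2$, $c_1 = c + (c^2 - b^2)^{1/2}$, $\tau_2 = 2a - 1$ and $\tau_1 = 2Ka + L$ with the $O(a^{-1})$ remainder dropped. Function names and the sample values of K and L are illustrative.

```python
import math


def exact_ratio_c1_over_b(a, k, l=0.0):
    """Exact c1/b for tau1 = 2*K*a + L and tau2 = 2*a - 1."""
    tau1 = 2.0 * k * a + l
    tau2 = 2.0 * a - 1.0
    b_squared = a * a - tau1
    x = tau1 / tau2
    if b_squared <= 0.0 or x <= 1.0:
        raise ValueError("choose a large enough that b and c1 are real")
    c = a - x
    c1 = c + math.sqrt(x * (x - 1.0))  # uses c^2 - b^2 = x*(x - 1), as in (50)
    return c1 / math.sqrt(b_squared)


def approximate_ratio_c1_over_b(a, k):
    """First-order expansion 1 + sqrt(K^2 - K)/a from (64)."""
    return 1.0 + math.sqrt(k * k - k) / a


if __name__ == "__main__":
    for a in (100.0, 1000.0, 10000.0):
        exact = exact_ratio_c1_over_b(a, 2.0)
        approx = approximate_ratio_c1_over_b(a, 2.0)
        print(a, exact, approx, abs(exact - approx))
```

If the expansion holds, the discrepancy should shrink roughly by a factor of four each time a is doubled, consistent with a remainder of order $a^{-2}$.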


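Letting z = an/2, (31) becomes

$$\tanh z = \left[1 + \frac{(K^2 - K)^{1/2}}{a} + O(a^{-2})\right] \tanh\left[\left(1 - \frac{K}{a} + O(a^{-2})\right) z\right]. \tag{65}$$

Using the addition formula for tanh (A + B)z and simplifying, we have

$$\tanh^2 z \,\tanh\left[\left(\frac{K}{a} + O(a^{-2})\right) z\right]
  + \left[\frac{(K^2 - K)^{1/2}}{a} + O(a^{-2})\right] \tanh z
  - \left[1 + \frac{(K^2 - K)^{1/2}}{a} + O(a^{-2})\right] \tanh\left[\left(\frac{K}{a} + O(a^{-2})\right) z\right] = 0. \tag{66}$$

We show that z/a approaches zero with 1/a and use this to simplify (66).
From (35)

$$z < \frac{\tanh^{-1}\left\{\left[1 + (K^2 - K)^{1/2}/a + O(a^{-2})\right]^{-1}\right\}}{1 - K/a + O(a^{-2})}
  \sim \frac{1}{2} \ln\left(\frac{2a}{(K^2 - K)^{1/2}}\right) \frac{1}{1 - K/a + O(a^{-2})}.$$

Since

$$\lim_{u \to \infty} \frac{\ln u}{u} = 0$$

we have

$$\lim_{a \to \infty} z/a = 0.$$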


Returning to (66), we now have asymptotically

$$\tanh^2 z + \left(\frac{K - 1}{K}\right)^{1/2} \frac{\tanh z}{z} - 1 = 0.$$

Solving for K we obtain

$$\frac{1}{K} = 1 - \left[\left(\frac{1}{\tanh z} - \tanh z\right) z\right]^2. \tag{67}$$

Now the relative pull-in given in Section 5.4 is


i= 


ie + D tanh z 





Yp 


and we easily show that [using (54)] 


qi -at+l 
2c 


_(K’- K)i-K+1+0(") 
a—- K+ 0(a") 


= O(a’). 


1-D= 





Hence asymptotically for fixed z,

$$\gamma_p = \tanh z + O(a^{-1}). \tag{68}$$

Thus for given relative pull-in, the above gives us z and tanh z, and
then (67) gives K from which (57) gives for large $\tau_2$

$$\tau_1 = K(\tau_2 + 1). \tag{69}$$
Reliability of Components for
Communication Satellites


By I. M. ROSS 
(Manuscript received August 15, 1961) 


This article considers the reliability of components such as transistors, 
diodes, and solar cells in relation to the design of a communication satellite 
with adequate reliability. Consideration is given to methods for determining 
the reliability of high-quality components and of techniques for selecting the 
most stable components for this application. It is concluded that, at least for
a simple communication satellite, components can now be obtained that will 
lead to a satisfactory life. 


I. INTRODUCTION 


All the necessary components and circuit techniques are available to
fabricate a simple communication system using low-orbit satellites.¹
Such a system would use many satellites at an altitude of a few thousand 
miles and be capable of global communications with a few megacycles 
baseband. The ground receiver portion of the system could achieve 
adequate signal-to-noise for very low received power by use of high-gain 
receiving antennas, low-noise maser receivers and FM modulation with 
feedback. The satisfactory performance of this type of receiver was 
demonstrated in the Echo I experiment.² In conjunction with such
sensitive ground receiver equipment, it is possible to use a satellite re- 
peater putting out only a few watts of power from an isotropic antenna, 
and hence avoiding the additional complexity of attitude stabilization. 
The components needed for such a satellite, including the traveling-wave 
tubes, transistors, diodes and solar cells, are all either available or 
achievable within the capability of existing technology. Thus a com- 
munication satellite system is feasible in principle. Whether or not it is 
economical and therefore practical, depends upon the life expectancy of 
the system, and specifically on the life of the satellite itself. It will be 
assumed here that a satellite life of at least five years is a reasonable 
target in the design of a practical communication system. By the very 
nature of the system, repair of the satellite is presently impossible (and if
ever possible, would be exorbitantly expensive), and because of the cost
penalty of additional weight in orbit, extensive redundancy is most 
undesirable. Thus, the practicality of the system depends critically on the 
reliability of the components that make up the satellite itself. This paper 
is devoted to a discussion of the reliability of components in relation to 
the design of a satellite with adequate reliability. Although the discussion 
is directed specifically to low-orbit (several thousand miles altitude) 
satellites, many of the ideas could apply equally well to higher orbits. 

In Section II, below, consideration is given to the order of component
reliability needed in a simple communication satellite. Section III deals 
with the reliability of components in general with emphasis on means for 
attaining highly reliable components and for determining quantitatively 
their degree of reliability. Section IV discusses the level of reliability that 
can be achieved in three critical classes of components, namely transistors 
and diodes, traveling-wave tubes and solar cells. Finally, it is concluded 
that, with careful manufacture and selection, components can be ob- 
tained for a practical communication satellite system. 


II. COMPONENT RELIABILITY REQUIRED FOR COMMUNICATION SATELLITES 


For the consideration of reliability it is convenient to divide the life of a 
satellite into three periods, namely pre-launch, launch, and orbit. It is 
usual practice to assume that any failure that occurs a reasonable time 
prior to lift-off can be corrected by replacement and that, at the worst, 
this could result in some delay in the launch time. For such an assump- 
tion to be valid, it is necessary that components or batches of components 
be accessible and removable so that failed portions of the satellite can be 
replaced. The design for such flexibility does necessitate some weight 
increase. Although the launch period is short, it is accompanied by large 
mechanical stresses liable to cause failure. As will be discussed later, in 
the section on traveling-wave tubes, experience with many launches has 
shown that with well designed components and equipment, failure during 
launch of the electronic equipment in a satellite is not a significant factor 
in the over-all reliability of the satellite. It is the third period, life in 
orbit, which dominates the reliability design of a satellite. In this section 
we consider the relationship between the reliability of components and 
the anticipated life in orbit. 

In calculating the probability of survival of a system containing a 
large number of components, it is frequently assumed that the failure 
distribution of any type of component is exponential. On such an assump- 
tion, the performance of a given type of component can be characterized 
by a mean time to failure or a failure rate. One of the more convenient
ways to represent the failure rate is in terms of a number of failures for a
given number of component operating hours. A method which is in
increasing use defines failure rate as the number of failures per $10^9$
component hours (1 failure per $10^9$ hours corresponds to a failure rate of
0.0001 per cent per 1000 hours). By way of calibration, a good resistor or
capacitor has a failure rate in the range 5 to 10 per $10^9$ component hours,
while an entertainment receiver tube will have a rate in the neighborhood
of 100,000 per $10^9$ hours.
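The unit conversions in this paragraph can be written out explicitly. The Python sketch below is illustrative only: the function names are not from the paper, and the mean time to failure assumes the exponential failure distribution discussed above.

```python
def percent_per_1000_hours(failures_per_1e9_hours):
    """Convert failures per 1e9 component hours to per cent per 1000 hours."""
    per_hour = failures_per_1e9_hours / 1e9
    return per_hour * 1000.0 * 100.0


def mean_time_to_failure_hours(failures_per_1e9_hours):
    """Mean time to failure, assuming a constant (exponential) failure rate."""
    return 1e9 / failures_per_1e9_hours


if __name__ == "__main__":
    for rate in (1.0, 10.0, 100000.0):
        print(rate, percent_per_1000_hours(rate), mean_time_to_failure_hours(rate))
```

For example, the entertainment receiver tube quoted at 100,000 per $10^9$ hours corresponds to 10 per cent per 1000 hours in the older unit.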

If we assume that a given system contains $n_1$ components of a given
type, and that the failure rate for that type is $f_1$ per $10^9$ hours, we expect
statistically that there will be $n_1 f_1$ failures per $10^9$ hours. Hence in a time t
hours we expect $n_1 f_1 t/10^9$ failures. Assuming that failure probability is
random and that the failure of any one of these components leads to
failure of the system (that is, assuming no redundancy), the probability
$P_s$ that the system will not fail in t hours due to failure of one of the
$n_1$ components is given by:

$$P_s = \exp\left[-\frac{n_1 f_1 t}{10^9}\right]. \tag{1}$$

Similarly, if we have a system composed of $n_1$, $n_2$, etc., components of
types having failure rates $f_1$, $f_2$, etc., and we again assume no
redundancy, the probability $P_s$ of survival for time t is given by:

$$P_s = \exp\left[-\frac{t}{10^9} \sum_m (n_m f_m)\right]. \tag{2}$$


This simple equation can be used to estimate the probability of the system's
survival, provided that the following conditions are met:

a) The failure mode of the components is assumed random with 
recognized exceptions being treated separately. 

b) The system contains no redundancy. 

Assumption b) is unrealistic since a certain degree of redundancy will be 
featured in any good design. However, because of weight limitations in a 
satellite, redundancy cannot be used to correct for poor reliability per- 
formance of a majority of the devices. Hence the equation is useful in 
determining desired objectives. 
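Equation (2) is simple enough to evaluate directly. The following Python sketch is one way to do so under the two stated conditions (random failures, no redundancy); the function name, the constant for hours per year, and the sample system of 1000 identical components are illustrative assumptions.

```python
import math

HOURS_PER_YEAR = 8760.0  # assumed: 365 days of 24 hours


def survival_probability(components, hours):
    """Probability of no failure in the given time, per equation (2).

    components: iterable of (number, failure rate per 1e9 component hours).
    """
    total_rate = sum(number * rate for number, rate in components)
    return math.exp(-hours * total_rate / 1e9)


if __name__ == "__main__":
    # Hypothetical system: 1000 components, each at 10 failures per 1e9 hours.
    system = [(1000, 10.0)]
    for years in (1, 5):
        print(years, survival_probability(system, years * HOURS_PER_YEAR))
```

Because only the sum of the products enters the exponent, a few high-rate components can dominate the result as easily as many low-rate ones.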

Table I shows the results of reliability calculations for a hypothetical 
communication satellite. At the left of the table are listed the types and 
numbers of critical components used. These types and numbers, which 
are representative of a very simple repeater of a few megacycles base- 
band, do not include any allowance for redundancy, nor do they include 
allowance for the telemetry invariably associated with such a system.
Excluded from the list is the traveling-wave tube. The unique life
properties of the single traveling-wave tube in a nonredundant satellite warrant
special treatment. Also excluded are the solar cells which, as will be
discussed later, will probably fail through wear-out resulting from
radiation damage and thus cannot be treated with the statistics of
equation (2).

TABLE I - RELIABILITY CALCULATION FOR SIMPLE COMMUNICATION SATELLITE
(f = failure rate in failures per 10^9 hours; nf = product of number and failure rate)

                                            Case I          Case II         Case III
Type of Component          Number (n)     f       nf      f       nf      f       nf
Transistor                     140       20     2800     10     1400      5      700
Diodes                         161       15     2415     10     1610      5      805
Resistor                       400        5     2000      5     2000      2      800
Capacitor                      250       10     2500      5     1250      2      500
Inductor and Transformer        40       20      800     15      600      5      200
Relays                           6       50      300     25      150     20      120
Ni-Cd Cells                     20       50     1000     25      500     15      300
Totals                        1017            11,815            7510            3425
Average Failure Rate                     11.6             7.4             3.4
Probability of success, 1 year           0.901            0.94            0.97
Probability of success, 5 years          0.60             0.72            0.86

The table shows three cases, each assuming somewhat different failure 
rates for the components. For each case the table gives the failure rate f 
assumed for the component, the product of the failure rate times the
number n of each component, the total sum $\sum_m (n_m f_m)$ and the average
failure rate. Also shown in the table is the probability of success of the
satellite, i.e., no failure of any component as calculated using (2), for 
one-year operation and for five-year operation. It is seen that case 1 
represents satisfactory performance for one year and poor performance 
for five, while case 3 represents satisfactory performance for five years. 
Case 2 is an intermediate case. Using some judgment as to the relative 
values of failure rates for various components, the failure rates were 
chosen in the three cases to give the above results. Thus the table shows 
what level of component reliability is needed to meet a given systems 
performance. 
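The entries of Table I can be recomputed from equation (2). The Python sketch below uses the component counts and failure rates as read from the table in this copy (the Case II diode product and the Case III relay rate are taken as 1610 and 20, the values consistent with the printed totals) and assumes 8760 hours per year; the variable and function names are illustrative.

```python
import math

HOURS_PER_YEAR = 8760.0  # assumed: 365 days of 24 hours

# (component, number, failure rates per 1e9 hours for Cases I, II, III)
TABLE_I = [
    ("Transistor", 140, (20, 10, 5)),
    ("Diodes", 161, (15, 10, 5)),
    ("Resistor", 400, (5, 5, 2)),
    ("Capacitor", 250, (10, 5, 2)),
    ("Inductor and Transformer", 40, (20, 15, 5)),
    ("Relays", 6, (50, 25, 20)),
    ("Ni-Cd Cells", 20, (50, 25, 15)),
]


def case_summary(case_index):
    """Return (sum of n*f, average rate, P(1 year), P(5 years)) for one case."""
    count = sum(number for _, number, _ in TABLE_I)
    total = sum(number * rates[case_index] for _, number, rates in TABLE_I)
    p1 = math.exp(-HOURS_PER_YEAR * total / 1e9)
    p5 = math.exp(-5.0 * HOURS_PER_YEAR * total / 1e9)
    return total, total / count, p1, p5


if __name__ == "__main__":
    for index, label in enumerate(("Case I", "Case II", "Case III")):
        total, average, p1, p5 = case_summary(index)
        print(f"{label}: sum nf = {total}, average = {average:.1f}, "
              f"P(1 yr) = {p1:.3f}, P(5 yr) = {p5:.2f}")
```

Small differences in the last digit relative to the printed table can arise from the assumed number of hours in a year and from rounding.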

It must be emphasized that considerable caution is needed in the 
interpretation of the results shown in Table I. Implicit in the calculations 
are many assumptions, the validity of which could be questioned. The
results should therefore be used as a guide to the order of magnitudes of
reliability required and should not be considered to be precise predictions
of systems performance. There are, nevertheless, a number of general
conclusions to be drawn from the table. The first is that although this is a
fairly simple system (1000 components), average failure rates in the
neighborhood of 10 per $10^9$ component hours are required to give
anything approaching economical life. As seen from (2), the life expectancy
for a given probability of success varies inversely with the average
failure rate. Thus, an average failure rate in the neighborhood of 100
would be intolerable, while an average failure rate in the order of 1 would
permit increased design life and/or complexity. A second conclusion is
that all the components that are numerous, i.e., all the transistors,
diodes, resistors and capacitors, require an equally high order of
reliability. This conclusion results directly from forbidding redundancy for the
high-runner components. A final conclusion is that, at least for the more
reliable designs, the reliability of connections between components
cannot be ignored. For the 1000 components of Table I there would be
several thousand connections and hence, in order that there be an
insignificant probability of a connection failure, they must have failure
rates substantially less than 1 per $10^9$ hours. Although there is little
quantitative information regarding reliability of connections, it is 
believed that those liable to fail are eliminated during the vibration, 
temperature cycle, and vacuum tests normally carried out as part of the 
acceptance test of a complete satellite. 
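Two of these conclusions follow directly from inverting equation (2). The Python sketch below is an illustration under stated assumptions: the target success probability of 0.86 and the figure of 3000 connections are chosen here as examples ("several thousand" is all the text gives), 8760 hours per year is assumed, and the function names are not from the paper.

```python
import math

HOURS_PER_YEAR = 8760.0  # assumed: 365 days of 24 hours


def life_for_success_probability(total_rate_per_1e9_hours, probability):
    """Hours t for which exp(-t * total_rate / 1e9) equals the given probability."""
    return -math.log(probability) * 1e9 / total_rate_per_1e9_hours


def connection_survival(connections, rate_per_1e9_hours, years):
    """Probability that no connection fails, treating connections as components in (2)."""
    exponent = connections * rate_per_1e9_hours * years * HOURS_PER_YEAR / 1e9
    return math.exp(-exponent)


if __name__ == "__main__":
    # 1000 components at average rates of 1, 10 and 100 per 1e9 hours.
    for average_rate in (1.0, 10.0, 100.0):
        hours = life_for_success_probability(1000 * average_rate, 0.86)
        print(average_rate, hours / HOURS_PER_YEAR, "years")
    # Hypothetical 3000 connections over five years at two failure rates.
    for rate in (1.0, 0.1):
        print(rate, connection_survival(3000, rate, 5.0))
```

The first loop shows the inverse proportionality between life and average failure rate; the second shows why a connection failure rate of 1 per $10^9$ hours would still leave a noticeable contribution from several thousand connections.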


III. RELIABILITY OF COMPONENTS


Fig. 1 shows a possible failure pattern for a batch of components. 
Such a curve could be obtained by taking a large number of new compo- 
nents of a specific type, operating them under typical conditions, and 
plotting the failure rate versus time for the batch. The distribution has 
two regions of relatively high failure rate, one early in life and
attributable to “manufacturing freaks,” one later in life attributable to
“wear-out,” separated by a region of low failure rate labeled “random failure.”
These three regions will be discussed separately. 


3.1 Wear-Out Failure 


In some manufactured products there is a mechanism or a collection of
mechanisms which systematically reduces the useful performance of the
product until a point is reached at which it has no further utility and is
“worn out.” Typical examples of wear-out mechanisms are friction of
bearings, corrosion of relay contacts, and deactivation of electron tube
cathodes. If, for a given batch of components, conditions were identical
during fabrication and use, then all components would fail in response
to wear-out simultaneously. However, because conditions are not
identical, simultaneous failure does not occur, and the failure distribution
is characterized by a peak of finite width. Region III in Fig. 1 shows the
onset of wear-out. Once wear-out failure commences, the failure rate of
the batch of components increases very rapidly, and effectively all
components of that type must be replaced. In systems such as satellites,
where replacement is not possible, the time at which wear-out becomes
significant should be greater than the designed life of the satellite.
Lengthening of the time to wear-out can only be achieved by
understanding the wear-out mechanisms and by designing the components
either to minimize or eliminate these effects.

Fig. 1 - Possible failure distribution for a large number of new components.


3.2 Manufacturing Freak Failure 


There is a certain percentage, preferably small, of any product that 
fails unusually early in life because of some defect in manufacture. 
These are, in a sense, objects that were not made according to the design. 
For example, such early failures can occur both in tubes and
semiconductor devices as a result of defective seals or of the presence of particles
inside the encapsulations. The prevalence of manufacturing freaks can 
be reduced drastically by quality control in manufacture. Remaining 
freaks can usually be detected and rejected by rigorous pre-aging tests, 
such as leak tests, vibration and shock tests. In addition, the product can 
be aged for a period longer than that corresponding to Region I, so that 
the remaining freaks will fail during this “pre-age period.”


3.3 Random Failure


Even in a well designed and well manufactured product there may be 
a substantial period, after that exhibiting high failure rate due to manu- 
facturing freaks and before wear-out occurs, of a continuing failure 
rate. These failures include components which, through presumably 
detectable causes, fail in response to manufacturing weaknesses much 
later than the majority of freaks, and others which fail through similar 
causes to, but earlier than, the wear-out failures. The failures that occur 
during this period may generally be attributable to a large number of 
different causes, each of which occurs so rarely that it would be exorbi- 
tantly expensive to identify all of them. This period is in essence the 
useful life of the product. If the frequency of such failures is sufficiently 
low, as indicated, these may be essentially below the noise level of 
identification of mechanisms, and a random failure mechanism, and 
hence a constant failure rate, may be assumed. Although there may be 
considerable doubt as to the validity of this assumption for some com- 
ponents, it has proved useful in the estimation of over-all systems relia- 
bility. 

Fig. 2 - Summary of steps that can be taken to reduce failures of various
types.

Fig. 2 summarizes the steps that can be taken to cope with the
various modes of failure shown in Fig. 1. The region of high failure rate
corresponding to wear-out can be moved further out in time by design
based upon knowledge of the failure mechanisms. The number of devices
subject to early failure through manufacturing freaks can be reduced
by quality control, rejected after testing and annihilated by pre-aging.
Hence, provided sufficient care is taken, it is possible to obtain a
product which, during the intended life of the system, will exhibit
substantially only a low failure rate corresponding to Region II. This failure
rate can be determined from the results of extensive life tests involving,
for the most reliable components, thousands of devices for thousands
of hours.
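The scale of such life tests follows from the constant-failure-rate assumption. As a rough sketch, if a batch accumulates T device-hours with no failures, the probability of that outcome for a true rate λ is exp(−λT), so demonstrating a rate below λ with confidence C requires about T = −ln(1 − C)/λ. The Python code below evaluates this; the confidence level of 0.9, the batch size of 10,000 and the function name are assumptions made for illustration, not figures from the paper.

```python
import math


def device_hours_for_zero_failures(rate_per_1e9_hours, confidence):
    """Failure-free device-hours needed to support a rate below the given value.

    Assumes a constant failure rate, so that P(no failures) = exp(-rate * hours).
    """
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must lie strictly between 0 and 1")
    return -math.log(1.0 - confidence) * 1e9 / rate_per_1e9_hours


if __name__ == "__main__":
    total = device_hours_for_zero_failures(10.0, 0.9)
    devices = 10000  # hypothetical batch size
    print(f"{total:.3e} device-hours, or {total / devices:.0f} hours "
          f"on each of {devices} devices")
```

For a target of 10 failures per $10^9$ hours this gives a few hundred million device-hours, which is why thousands of devices must be run for thousands of hours.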

The low failure rate of Region II is that characteristic of the product. 
Where reliability is of supreme importance, it is desirable to select 
from the product as a whole those components that exhibit the greatest 
degree of stability. This can be achieved by putting on life test a num- 
ber of components many times that needed in the system, and after a 
given length of time selecting from the batch only those components 
which have shown the minimum change in their parameters. The dura- 
tion of the life test prior to selection will depend upon a number of 
factors, including the life required in the system and the system’s 
schedule which, itself, frequently limits the life-test period. In the selec- 
tion of submarine cable tubes, a period of seven months is used. Al- 
though it is expected that the selected product will have a lower failure 
rate than the batch from which it was selected, it is difficult, if not 
impossible, to estimate the degree of this improvement. The consensus, 
however, is that a factor of 10-100 improvement could be achieved. 

In order to achieve the reliability potential of a carefully designed 
and manufactured component, it is essential that the same care go into 
the design and assembly of circuits and subsystems. Circuits must be 
designed with adequate margins, and power dissipations must be deter- 
mined so that temperatures do not reach values at which reliability of 
the components is no longer adequate. Assembly procedures should be 
arranged to avoid excessive mechanical or thermal shock. The conserva- 
tive use of a component is thus an important part of the achievement 
of reliability. 


IV. RELIABILITY OF SPECIFIC COMPONENTS 


The components that appear in large number in a typical satellite 
and require reliabilities corresponding to 10 failures per $10^9$ hours,
include transistors, diodes, resistors and capacitors. Passive components,
resistors and capacitors, have for many years been available with relia- 
bility in this range. However, until recently such low failure rates had 
not been achieved in the active components. For this reason the discus- 
sion in Section 4.1 below is restricted to transistors and diodes. 

The traveling-wave tube used to generate the output power in most 
communication satellite designs does not require the high degree of 
statistical reliability called for in transistors and diodes. However, it is 
required to operate without failure for a period much longer than the
life of ordinary tubes and also to withstand severe mechanical stress
during launch. The expected performance of satellite tubes is discussed 
in Section 4.2 below. 

The solar cells, although as numerous as the transistors and diodes,
are expected to fail due to “wear-out” from radiation damage. The
expected life of these components is discussed in Section 4.3.


4.1 Transistors and Diodes 


As indicated previously, the reliability of a component in the final 
analysis is limited both by the design of the component and the care 
with which it is manufactured. The attention to design and manufacture 
is particularly important in the case of transistors and diodes which are 
both delicate and particularly sensitive to contamination, yet are re- 
quired to exhibit failure rates comparable to those of the more rugged, 
passive components. Mechanical techniques have been developed 
whereby small semiconductor wafers can be bonded to headers and 
even smaller leads connected between the wafers and the headers, such 
that the resulting structure will easily withstand the mechanical shock 
and vibration experienced during the launch of a satellite and the tem- 
perature cycling that may be experienced while in orbit. Final cleaning 
and sealing techniques have also been developed which insure a degree 
of initial cleanliness and subsequent protection from outside contamina- 
tion, such that adequate reliability for satellite applications can be 
achieved. 

Table II outlines the complete reliability testing program proposed 
by Bell Laboratories for providing transistors and diodes for satellite 
applications. The first step is to insure that the design itself has ade- 
quate reliability potential. In order for a design to qualify for satellite 
use, it must pass mechanical tests which represent conditions more 
rugged than will be experienced during launch. The devices are further 
subjected to electron and proton bombardment simulating many years'
exposure to Van Allen radiation. Finally, devices are subjected to
reliability evaluation to determine the reliability potential of the design.

The second step, that of screening and pre-aging, is designed to elimi- 
nate those few remaining freaks that were not eliminated by quality 
control. These tests include mechanical shock and vibration tests to 
eliminate weak components. In the reliability portion of these tests, a 
sample from the particular manufacturing lot is tested at increasing 
temperatures until all devices in the sample have failed. The median 
temperature for failure and the distribution of failures with temperature, 
when compared with similar figures for previous manufactured lots,
indicate whether or not there are major differences from previous lots.
In addition, all the devices that may be used in satellites are subjected
to a short period of high temperature aging. Since, as discussed later,
aging is accelerated by raising temperature, this pre-age eliminates
many devices that otherwise would have exhibited unusually early
failure.

TABLE II - RELIABILITY PROGRAM FOR SATELLITE TRANSISTORS AND DIODES

1. Design Qualification Tests
   Mechanical
      Temperature cycling              -65C to +85C
                                       (-120C to +40C for blocking diodes)
      Temperature-humidity cycling
      Shock                            ... g
      Centrifuge                       5,000-10,000 g
      Vibration                        60 g, 100-2,000 cycles
   Radiation
   Reliability
      Accelerated aging
      Life testing
      Field experience

2. Screening and Pre-aging
   Mechanical
      Centrifuge                       2,000 g
      Temperature-humidity cycle
      Tap or shock
   Reliability
      Accelerated temperature sample
      High-temperature aging

3. Life Test and Selection
   Reliability
      System simulation and selection

The third step consists of choosing from the components that have 
passed step two, a number many times greater than the number that 
are finally to be used, and putting them on life test for six months under 
power and temperature conditions simulating those anticipated in 
operation. The duration of this test, which ideally should be a sub- 
stantial fraction of the design life of the system, is frequently limited by 
economic factors or by the time available prior to the system’s opera- 
tion. During the life-test period, the characteristics of the components 
are measured at frequent intervals. The components needed for the 
system are chosen on the basis of their performance during the life-test 
period. If proper choices have been made, the components used should 
be ones which have shown no change in characteristics. 




Fig. 3 - Results of a typical accelerated aging experiment on germanium
transistors.

Fig. 4 - Failure rate vs temperature for germanium transistors.

Steps 2 and 3 in this program are intended to insure that the
components selected are truly representative of the design and do not
include any freaks. Assuming these steps to be successful, the most
significant portion of the program in determining system performance is
the evaluation in step 1 of the reliability potential of the product. Since
the reliability required is in the neighborhood of a few failures per $10^9$
hours, this reliability evaluation can involve tens of thousands of
components for tens of thousands of hours. It is with the object of reducing
the numbers and times involved that considerable emphasis has been
put on the development of accelerated aging techniques.³,⁴,⁵ The results
of a typical accelerated aging experiment are shown in Fig. 3. Plotted
in the figure is the median life of a germanium transistor as a function
of the temperature at which the transistor is operated. The data shown
as solid points were obtained for some germanium transistors
manufactured by the Western Electric Company in 1958. The temperatures
at which the transistors were tested range from 100°C to as high as
350°C, while the range in time to median failure is from about 20
minutes to just over 1 year, nearly 5 decades. The fact that the points
fit a straight line on a 1/T versus log time plot suggests that raising
the temperature is accelerating some failure mode which can be
characterized by an activation energy. It has been found that within
experimental error, the apparent activation energy is the same for all
germanium transistors and, in addition, that there is a single but slightly
different activation energy for all silicon transistors and diodes. The
triangles in Fig. 3 are for transistors manufactured by the Western
Electric Company more recently. It is apparent that substantial im- 
provements have been made at least in the high-temperature perform- 
ance of the product. The data in Fig. 3 are for the median life. In per- 
forming the accelerated aging experiments, one also obtains the dis- 
tribution of failures in time for a given temperature or, alternatively, 
in temperature for a given time. It is found that these distributions 
have the same shape, i.e., log normal in time* and normal in tempera- 
ture, for all transistors and diodes. The widths of the distributions do 
not change with temperature for a given device type, that is, for fixed 
design and manufacturing procedure. This uniformity of failure distribu- 
tion gives further confidence that raising temperature is accelerating a 
failure mode characteristic of the product. 
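
The straight line on the 1/T versus log time plot can be illustrated numerically. The sketch below is not the analysis used for Fig. 3: it takes only the two extremes quoted in the text (about 20 minutes at 350°C and about one year at 100°C), assumes that both lie exactly on the line, and derives the apparent activation energy those two points would imply. The function names are introduced here for illustration.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333e-5  # Boltzmann constant, eV per kelvin


def fit_arrhenius(points):
    """Least-squares fit of ln(median life) = intercept + slope * (1/T).

    points: list of (temperature_celsius, median_life_hours) pairs.
    Returns (intercept, slope); the slope equals Ea/k in kelvin.
    """
    xs = [1.0 / (t_c + 273.15) for t_c, _ in points]
    ys = [math.log(life_h) for _, life_h in points]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_mean - slope * x_mean, slope


def median_life_hours(intercept, slope, t_c):
    """Median life predicted by the fitted line at temperature t_c (Celsius)."""
    return math.exp(intercept + slope / (t_c + 273.15))


# Assumption: the two quoted extremes are treated as exact data points.
endpoints = [(350.0, 20.0 / 60.0), (100.0, 8766.0)]
intercept, slope = fit_arrhenius(endpoints)
activation_energy_ev = slope * BOLTZMANN_EV_PER_K

print(f"apparent activation energy: {activation_energy_ev:.2f} eV")
print(f"median life at 200 C: {median_life_hours(intercept, slope, 200.0):.0f} h")
```

With these two points the implied activation energy comes out at roughly 0.8 eV. A fit to the full set of measured points would be needed before attaching any significance to that figure.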

Knowing the variation of median life with temperature and the
distribution of failures in time for a fixed temperature, it is possible to
derive a more useful plot for the systems designer, that of failure rate
against temperature as shown in Fig. 4. The points are for the older
transistors from the previous figure. A straight line is observed in the
plot of 1/T against log failure rate. Extrapolating the line to room
temperature, one would predict a failure rate of 10 per 10⁹ hours for these
transistors. This prediction from the acceleration curve of Fig. 4 is,
however, liable to be optimistic because there is no guarantee that the
curve does not dip below the straight line for times greater than the
longest at which a measurement was made. There is no guarantee that in
raising the temperature we are accelerating all the failure mechanisms,
or even that we are accelerating the most important failure mechanism at
operating temperatures. For example, although one might expect that
raising the temperature would increase the rate of reaction between the
germanium surface and any water vapor inside the transistor can, one has
no reason to suspect that elevated temperature would affect the
occurrence of a short-circuit caused by a metal chip falling between
emitter and base contact.

* This is an example of a component that in the region of low failure rate does
not exhibit the exponential failure distribution usually assumed.

The accelerated aging curve, when extrapolated to room tempera- 
ture, indicates the potential reliability of the design, and in the final 
analysis one must depend upon laboratory tests or field experience 
under operating conditions. The triangle on Fig. 4 shows the failure 
rate observed in the field trial of a new system using about 40,000 of 
these same transistors for about 10,000 hours. It is encouraging that 
the failure rate is only a factor of about 2 higher than that predicted 
from accelerated aging, and particularly so since the system failure 
rate includes failures due to mishandling and is for devices which were 
subjected to no special selection. It is therefore reasonable to estimate 
that the failure rate for these older germanium transistors, when prop- 
erly handled and selected in a manner proposed for satellite use, would 
lie somewhere in the neighborhood of 10 to 20 per 10⁹ hours.
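
The arithmetic behind the field-trial comparison is brief. The sketch below assumes that the failure rates are quoted per 10⁹ device-hours and takes the factor of 2 at face value; the failure counts it prints are implied estimates, not figures reported in the text.

```python
def device_hours(n_devices, hours_each):
    """Total accumulated operating time, in device-hours."""
    return n_devices * hours_each


def expected_failures(rate_per_1e9_hours, total_device_hours):
    """Failures expected at a constant rate quoted per 1e9 device-hours."""
    return rate_per_1e9_hours * total_device_hours / 1e9


# Field trial described in the text: about 40,000 transistors for about 10,000 hours.
field_hours = device_hours(40_000, 10_000)

predicted = expected_failures(10, field_hours)         # rate extrapolated from Fig. 4
observed_estimate = expected_failures(20, field_hours)  # about twice the prediction

print(f"accumulated device-hours: {field_hours:.1e}")
print(f"failures expected at 10 per 1e9 h: {predicted:.0f}")
print(f"failures implied at 20 per 1e9 h:  {observed_estimate:.0f}")
```

The same arithmetic shows why direct evaluation is so costly: tens of thousands of components for tens of thousands of hours accumulate only a few times 10⁸ device-hours, in which a rate of a few per 10⁹ hours produces a mere handful of failures.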

The line through the squares in Fig. 4 is the accelerated aging curve 
for the more recent Western Electric product. Note again that there is 
a substantial improvement. The accelerated aging curve for recent sili- 
con transistors and silicon diodes does not differ significantly from that 
for germanium transistors. With such an improvement in the reliability 
potential of the product, and with careful pre-aging and selection, one 
is confident that failure rates substantially lower than 10 per 10⁹ hours
are now achievable and that they may well be lower than 1 per 10⁹
hours. However, complete confirmation of this prediction will have to 
await results of field trials. 

The acceleration curves serve to emphasize the importance of con- 
servative circuit design in the achievement of high reliability. It is seen 
from the slope of the curves that failure rate increases very rapidly 
with temperature. It is therefore important that power dissipation in the 
device be maintained sufficiently low that temperature rise above am- 
bient does not impair reliability. It is equally important that the am- 
bient temperature be maintained at a suitably low value. 


4.2 Traveling-Wave Tubes 


Fig. 5 is a photograph of the traveling-wave amplifier under development
at Bell Telephone Laboratories for use in experimental communication
satellites. Table III lists the more important characteristics of
this tube. Before discussing the performance and reliability of the
M4041 satellite traveling-wave tube, a few words are in order on the
reasons for selecting traveling-wave tubes to provide the output power
in the satellite. It would appear that if a solid-state device could
produce several watts at a few thousand megacycles, it would be, because
of its small weight and potential reliability, an obvious choice over the
traveling-wave tube. To date, however, schemes for generating power
at several thousand megacycles using solid-state devices (harmonic
generators, for example) operate at efficiencies very much lower than
that of a traveling-wave tube, even when heater power is included. The
weight of the additional solar cells needed to provide power for the
solid-state device would more than offset the decrease in weight from
that of a traveling-wave tube. The weight penalty for extra power is
particularly severe for satellites subject to Van Allen radiation, where
account must be taken not only of the weight of the solar cells and
their mounting but also of the necessary protective covers. The higher
gain of the traveling-wave tube gives it a distinct advantage over other
tubes such as triodes, which would require at least two stages and,
through consequent loss of efficiency, lead again to greater over-all
weight. The high efficiency of the traveling-wave tube results from the
distinct separation between the microwave interaction region and the
beam formation and collection regions. After the microwave interaction
takes place, the beam is allowed to enter a region of retarding field, where
the beam is slowed before collection. This is usually done by depressing
the collector voltage below that of the helix, as shown in Fig. 6. Since
very little current is intercepted on the helix and the anode, the input
power is very nearly proportional to the collector voltage. By depressing
the collector voltage, efficiencies as high as 39 per cent have been
achieved and 36 per cent is typical. When the power required by the
cathode heater is included, this value falls to a typical value of 31 per
cent. A second effect of collector depression is that ions generated
between the anode and the collector will flow to the collector and not to
the cathode. This results in a substantial decrease in the possible ion
current bombarding and consequently damaging the cathode.

Fig. 5 — Traveling-wave amplifier under development for satellite use.

TABLE III — SATELLITE TUBE CHARACTERISTICS, M4041 (7/7/61)

Operating point                                0 dbm input, saturated output
Output power (minimum)                         3.5 w
Gain (at saturation)                           35.5 db
Gain (low level)                               41 db
Anode voltage                                  1770 volts
Helix voltage                                  1540 volts
Collector voltage                              740 volts
Cathode current                                17.0 ma
Cathode current density                        85 ma/cm²
Collector power (including helix and anode)    12.5 w
Heater power                                   1.5 w
Weight                                         7.1 lbs.
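
The figures of Table III permit a rough check on the statement that input power is very nearly proportional to collector voltage. The sketch below assumes that the whole cathode current reaches the collector, so that helix and anode interception is ignored, and it uses the minimum output power of the table rather than a typical value; the function names are introduced here for illustration.

```python
def beam_power_watts(current_ma, volts):
    """DC power drawn when current_ma is collected at a potential of volts."""
    return current_ma * 1e-3 * volts


def dbm_to_watts(dbm):
    """Convert a power level in dbm (decibels relative to 1 milliwatt) to watts."""
    return 10 ** (dbm / 10.0) / 1000.0


# Values taken from Table III.
CATHODE_CURRENT_MA = 17.0
HELIX_VOLTS = 1540.0
COLLECTOR_VOLTS = 740.0
MIN_OUTPUT_W = 3.5
HEATER_W = 1.5

depressed_w = beam_power_watts(CATHODE_CURRENT_MA, COLLECTOR_VOLTS)
undepressed_w = beam_power_watts(CATHODE_CURRENT_MA, HELIX_VOLTS)

print(f"beam power, collector at 740 v:  {depressed_w:.2f} w")
print(f"beam power, collector at 1540 v: {undepressed_w:.2f} w")
print(f"efficiency, depressed:   {MIN_OUTPUT_W / depressed_w:.1%}")
print(f"  including heater:      {MIN_OUTPUT_W / (depressed_w + HEATER_W):.1%}")
print(f"efficiency, undepressed: {MIN_OUTPUT_W / undepressed_w:.1%}")
print(f"0 dbm input with 35.5 db gain: {dbm_to_watts(0.0 + 35.5):.2f} w")
```

The product of cathode current and collector voltage reproduces the 12.5 w collector power of the table closely, and collecting the beam at helix potential instead would roughly double the input power. Because the minimum output is used, the efficiencies computed this way fall below the typical values quoted in the text.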

The traveling-wave amplifier for a satellite must be a new design in
order optimally to meet the specific needs of the system. With any
reasonable time scale, it is not possible to carry out a long-term
evaluation of tube life, nor is it possible to do shorter experiments on
very large numbers of models as is done with semiconductor devices. It is
therefore necessary from the viewpoint of reliability to employ a design
closely derived from experience gained with previous tubes and to
utilize a ‘pedigree’ approach in the assembly process. These earlier
tubes include the pentodes used in telephone submarine cables,⁶ the
traveling-wave tubes used for microwave transmission at 6 kmc⁷ and
the rocket-borne traveling-wave tube used in a Bell Telephone
Laboratories missile guidance system.⁸ The salient features of these
tubes are discussed in the next few paragraphs.

Fig. 6 — Traveling-wave tube circuit with depressed collector.

The submarine cable tube, the 175HQ, was the first tube designed 
to meet long-life reliability requirements somewhat similar to those 
encountered in satellite work. The failure pattern for this tube was 
found to agree with that shown in Fig. 1. The dominant wear-out 
mechanism in this case was determined to be the deactivation of the 
cathode, an effect which increases rapidly with increasing cathode tem- 
perature. Design information was developed which permitted the choice 
of a cathode temperature low enough to insure the desired life of the 
tube. The techniques of quality control to eliminate manufacturing 
freaks, and of life test and selection to insure the minimum random 
failure rate, were used extensively on this tube. As a result, the tubes 
that have been manufactured and put into operation in submarine 
cables easily meet the systems requirements. For example, Fig. 7 shows 
the accumulated tube life of the tubes in operation to date in submarine 
cables. There are now over 1600 tubes in such operation, some for as 
long as five years, with an accumulated life of 49 million tube-hours and 
no failures. It is on the basis of this evidence that it is believed possible
to make long-life tubes and, in particular, to eliminate failure due to 
cathode deactivation. 

The second tube of interest is a 6 kmc traveling-wave tube used as a
ground-based microwave repeater, the M1789, now the WECo 444A.
This traveling-wave tube was the first designed by Bell Telephone
Laboratories specifically for long life, and it used many of the design
principles and many of the selection techniques developed for
submarine cable tubes. This tube also was designed to operate with a
depressed collector. A little over four years ago, twelve of these tubes
were placed on life test at their normal operating power of 5 watts.
Table IV shows the accumulated hours on each of these tubes as of
May, 1961, at which time there had been no tube failures. On the basis
of this experience and the fact that the satellite traveling-wave tube
has been designed to have a substantially lower cathode loading and
cathode temperature than the 6 kmc tube, the satellite tube has an
expectation of a life considerably in excess of four years.

The third tube is a traveling-wave tube designed to operate in the
Bell Telephone Laboratories Command Guidance System, the M1958,
now the 7116. In this system, the rocket to be guided contains a
receiver, decoder and transmitter. There is a component count
approximating 1000, including one traveling-wave tube. This system has been
used in the guidance for about one-third of the U.S. satellites now in
orbit. It was used, for example, with Echo I and with the three Tiros
satellites. There have been to date over fifty successive firings using
this guidance package with no failure. Since the guidance system needs
only to operate for a few minutes, it gives us little information on long
term reliability. However, since it not only must survive launch but
must also operate during launch, this performance is a very potent
demonstration that traveling-wave tubes can be made rugged enough
to withstand the strains of launch. It further demonstrates that an
electronic system containing roughly the number and kind of
components needed in an active satellite can also survive launch.

Fig. 7 — Operational life of electron tubes in undersea cable system repeaters.

TABLE IV — M1789 TRAVELING-WAVE TUBE LIFE TEST

Tube Number     Hours
BC-856          39502
BC-1342         39630
BC-1363         39319
BD-14           39256
BD-660          39401
BH-69           39256
BH-208          33994
BH-413          37813
BH-464          36840
BH-559          35394
BS-41           36615
BS-102          34959

To summarize, then, it is known from experience with the submarine 
cable tube and with the microwave relay tube that traveling-wave 
tubes can be designed with a life expectancy considerably in excess of 
four years. The performance of the guidance tube demonstrates that 
techniques are available for making a traveling-wave tube sufficiently 
rugged to withstand launch. 
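
The life-test evidence can be put in numerical form. The sketch below totals the hours of Table IV as transcribed and computes an upper confidence bound on failure rate after a failure-free test. The bound assumes a constant (exponential) failure rate, which the text itself cautions is not always valid, and the 90 per cent confidence level is an arbitrary choice made here for illustration.

```python
import math

# Accumulated hours per tube as transcribed from Table IV (May, 1961).
M1789_HOURS = {
    "BC-856": 39502, "BC-1342": 39630, "BC-1363": 39319, "BD-14": 39256,
    "BD-660": 39401, "BH-69": 39256, "BH-208": 33994, "BH-413": 37813,
    "BH-464": 36840, "BH-559": 35394, "BS-41": 36615, "BS-102": 34959,
}
HOURS_PER_YEAR = 8766.0


def zero_failure_rate_bound(total_hours, confidence=0.90):
    """Upper bound on a constant failure rate (per hour) after zero failures."""
    return -math.log(1.0 - confidence) / total_hours


total_tube_hours = sum(M1789_HOURS.values())
mean_years = total_tube_hours / len(M1789_HOURS) / HOURS_PER_YEAR

print(f"M1789 life test: {total_tube_hours} tube-hours, "
      f"mean {mean_years:.1f} years per tube")
print(f"  90% bound: {zero_failure_rate_bound(total_tube_hours) * 1e9:.0f} per 1e9 hours")

# Submarine cable tubes: 49 million tube-hours with no failures, per the text.
cable_bound = zero_failure_rate_bound(49e6)
print(f"cable tubes, 90% bound: {cable_bound * 1e9:.0f} per 1e9 hours")
```

Twelve tubes alone cannot establish a low failure rate statistically, which is why the argument in the text leans on design derivation and on the much larger body of submarine cable experience.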


4.3 Solar Cells 


Communication satellite designs for the immediate future rely on
silicon solar cells as the prime source of power. These cells will be
subject to radiation in the Van Allen belt,⁹ which consists of electrons with
substantial densities at energies up to 1 mev and protons at energies
as high as 100 mev. Fig. 8 is a map of the Van Allen belt on a plane
containing the earth’s magnetic axis. There is a peak in the electron
intensity at an altitude of about 2000 miles, and a second peak at about
10,000 miles with a substantial density of electrons at intermediate
altitudes. The protons, which are much less numerous, have a distribution
which also peaks at around 2000 miles and falls off in some undetermined
manner to negligible values beyond 10,000 miles. Bombardment of
solar cells with particles of such energy results in a continual decrease
of power output with time, at such a rate that this degradation could
result in the failure of the power supply within the desired life of the
satellite. Here then is an example of probable failure due to wear-out,
in which case it is particularly important both to understand the
mechanism of wear-out and to design the devices to minimize the effect. In
this section, we discuss the effects of Van Allen belt radiation on solar
cells, the means of designing cells to minimize the effects, and the
predicted performance of such specially designed cells.

Fig. 8 — Map of Van Allen radiation belt in plane through earth’s axis.

As shown in Fig. 9, a solar cell typically consists of a slice of n-type
silicon with a thin p-type layer on one surface and contacts made to 
both surfaces. When light falls on the p-type surface, the photons pene- 
trate the silicon to depths dependent upon their wavelengths and are 
absorbed with the creation of free carriers, hole-electron pairs, in the 
silicon. The free carriers created in response to the longer wavelength 
light are created deeper in the material. Some of the carriers move to 
the junction, and in crossing the junction create a current flow in the 
external circuit. Thus an illuminated solar cell is a source of electric 
power and has a voltage-current characteristic typically as shown in 
the figure. 

Fig. 9 — Solar cell construction and typical voltage-current characteristic.

In discussing the optimum design of solar cells, it is convenient to
divide the generated carriers into two classes, namely those that are
generated in the body of the material beneath the pn junction, and
those that are generated in the surface layer above the pn junction.
Those generated beneath the junction will reach it only if they are
generated within a distance called the diffusion length, that is, the
distance that generated carriers may move in the material before being
annihilated by recombination. The diffusion length is a property of a
particular material and depends critically upon its perfection and purity.
For a solar cell to have the maximum efficiency, this diffusion length
should be as long as possible in order that effectively all carriers
generated beneath the junction may reach the junction and contribute to
the output current. A somewhat different situation exists for the carriers
generated in the surface layer. This layer is usually quite thin compared
to a diffusion length. However, the surface of the semiconductor acts
as a sink for carriers and thus competes with the junction for carrier
collection. The net result is that the efficiency for collection of carriers
generated above the junction is less than that for carriers generated
below the junction. It is therefore desirable to minimize the thickness of
the surface layer.

The perfect solar cell therefore would have a zero thickness of surface 
layer and an infinite diffusion length. A zero thickness surface layer, 
however, would lead to infinite series resistance. Obviously a compro- 
mise is necessary. Fig. 10 shows the distribution of carriers generated 
in silicon in response to sunlight. The plot gives the percentage of car- 
riers generated beyond the value of the abscissa. It is seen that about 
75 per cent of the carriers are generated below 1 micron depth, and
that for a junction depth about 4+ micron, essentially all the carriers 
are generated below the junction. 

When high-energy electrons or high-energy protons are incident on a 
silicon solar cell, they create local disorder in the crystal which results 
in a steady decrease of diffusion length with time. A simple theory for 
Fig. 10 — Distribution of free carriers generated in silicon in response to sunlight.


656 THE BELL SYSTEM TECHNICAL JOURNAL, MARCH 1962 


the degradation of diffusion length predicts that the diffusion length $L$
should depend on the total flux $\Phi$ of electrons or protons according to
the equation:

$$\frac{1}{L^2} = \frac{1}{L_0^2} + K\Phi \tag{3}$$

where $L_0$ is the value of the diffusion length before irradiation and $K$ is
a constant for a given energy of particle and for a given semiconductor.
Hence, for large enough radiation fluxes, the diffusion length is in- 
versely proportional to the square root of the flux. Fig. 11 shows a plot 
of diffusion length versus flux of 1 mev electrons. The experimental 
points were obtained by measuring the diffusion length in silicon after 
successive exposure to 1 mev electrons from a Van de Graaff generator. 
The line on Fig. 11 is a two-parameter fit of (3) to the experimental 
data. Similar results are obtained for proton bombardment. 
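
The behavior of equation (3) is easily explored numerically. The sketch below reads the equation as 1/L² = 1/L₀² + KΦ, with L₀ the initial diffusion length and Φ the flux. The values chosen for L₀ and K are hypothetical round numbers, selected only so that the knee of the curve falls within the flux range plotted in Fig. 11; they are not the fitted parameters.

```python
import math


def diffusion_length(l0_microns, k, flux_per_cm2):
    """Diffusion length after irradiation, from 1/L^2 = 1/L0^2 + K * flux."""
    return 1.0 / math.sqrt(1.0 / l0_microns ** 2 + k * flux_per_cm2)


L0_MICRONS = 100.0  # hypothetical initial diffusion length, microns
K = 1.0e-18         # hypothetical damage constant, cm^2 per micron^2

for flux in (0.0, 1e14, 1e15, 1e16, 1e17):
    length = diffusion_length(L0_MICRONS, K, flux)
    print(f"flux {flux:8.1e} per cm^2 -> L = {length:6.1f} microns")

# At large flux the initial term is negligible and L varies as flux**-0.5,
# so quadrupling the flux should halve the diffusion length.
ratio = diffusion_length(L0_MICRONS, K, 1e18) / diffusion_length(L0_MICRONS, K, 4e18)
print(f"L(1e18) / L(4e18) = {ratio:.4f}")
```

On logarithmic axes the curve is flat at low flux and bends over to a slope of one-half beyond the flux at which KΦ equals 1/L₀², which is the shape a two-parameter fit of (3) must reproduce.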

As the diffusion length in a solar cell decreases with exposure to
radiation, fewer and fewer of the carriers generated deep in the silicon
are collected at the junction. Thus, the power output of the solar cell
decreases. Since, as pointed out earlier, the depth of generation increases
with the wavelength of light, the solar cell degrades initially by loss of
response to the longer wavelength, i.e., the red light. This fact has a
number of implications for the design of solar cells for use in the Van
Allen belt. Firstly, since it is the blue response that is likely to be
maintained, and this response involves the carriers generated closest to the
surface, it is most important for satellite solar cells that the junction
depth be minimized. Secondly, it is important that any antireflective
coating be optimized for blue light, not for red. Initial good response to
red light, which calls for long diffusion length, becomes of lesser
importance.

[Plot axes: diffusion length; 1-Mev electron flux, cm⁻²]

Fig. 11 — Diffusion length vs flux of 1 mev electrons.

It has been found by several investigators that the decrease of diffu- 
sion length in response to electron and proton bombardment is less 
rapid in p-type silicon than it is in n-type silicon.¹⁰ For this reason, cells
for satellite use are preferably made with a thin n-skin on a p-type
body rather than the other way around. Fig. 12 is a schematic diagram
of a solar cell designed at Bell Telephone Laboratories and incorporating
the features just discussed.¹¹ It is made on a p-type silicon body
with an n-layer } micron thick. In order to produce such a thin layer 
with good properties, it is necessary to minimize surface damage. For
this reason the surface used is given an optical polish. Such a thin layer 
tends to have high sheet resistance and calls for many contact fingers 
to minimize the effect of series resistance. Finally, the cell is given an 
antireflection coating of thickness designed to optimize the response to 
blue light. 

Having designed a cell to minimize the effects of radiation damage,
it is then necessary to consider what, if anything, can be done to shield
the cells from the radiation. In the case of electrons, substantially all of
which have energies of less than 1 mev, such shielding is practical using
materials like quartz or sapphire. Fig. 13 shows the measured
degradation of the short-circuit current of variously shielded solar cells
after electron bombardment corresponding to increasing time in the Van
Allen belt. The shield thicknesses are represented as g/cm². It is seen
that over the range for which the measurements were made (which
was equivalent to two years in the Van Allen belt) the effect of
electrons was eliminated by the use of 0.3 g/cm² of shielding. Shielding of
protons, which are much more energetic, would require intolerable
weights of material. However, the 0.3 g/cm², which eliminates the
electron damage, does provide some reduction in the proton damage.

Fig. 12 — Structure of Bell Laboratories solar cell for satellite use.

Fig. 13 — Solar cells with various shielding: measured degradation of
short-circuit current after electron bombardment.

Fig. 14 — Anticipated power output of solar cells as function of time in Van
Allen belt; with present data, error factor may be as great as 3 in time.

Fig. 14 is a plot of the anticipated power output of the solar cells
shown in Fig. 12 as a function of months in the heart of the Van Allen
belt. The curves were obtained by estimating the densities and energy 
distributions of electrons and protons in the Van Allen belt and subject- 
ing the cells to electron and proton bombardments simulating Van Allen 
conditions. There may be considerable errors in the estimation of Van 
Allen radiation and, as a result, the time to a given degradation may 
well be in error by a factor as great as 3. It should further be noted 
that the curves have been calculated for the case of a satellite that 
spends all its time in the Van Allen belt, and this is certainly pessimistic. 
A satellite in a circular polar orbit, for example, would spend approxi- 
mately % of the time in the Van Allen belt. 

The most significant feature of the curves in Fig. 14 is that the plot 
of power output per solar cell versus log time is approximately linear 
after initial degradation. This dependence is consistent with the antici- 
pated variation of diffusion length with flux, Fig. 11, and the distribu- 
tion of carriers generated in the silicon, Fig. 10. The degradation with 
time becomes progressively less severe at longer times. Thus, for the 
case of 0.3 g/cm² protection, the output after 10 months has dropped
from an initial value of 24 mw to about 16 mw while at the end of 100 
months it has dropped further only to 11 mw. This additional decrease 
in power output for a factor of 10 increase in time could be compen- 
sated for by a 50 per cent increase in the number of solar cells. It appears 
then that provided there has been no gross underestimate of the nature 
and effect of the Van Allen belt radiation, solar cell power can be
provided for a design life of five years and that the design life could be
increased without excessive penalty. The curves also illustrate the design
choices that can be made in selecting the mass of front protection. It 
is seen that for a given power output per cell, a factor of 3 increase in 
weight of protection yields about a factor of 5 improvement in life. 
However, the same improved life for a given power output could be 
achieved by retaining the lighter front protection but increasing the 
number of cells by 30 per cent. Just which is the best design of front 
protection thickness will depend on the particular satellite under con- 
sideration. For the case of the experimental satellite being designed at 
Bell Telephone Laboratories, a front protection consisting of 0.3 g/cm²
of sapphire was found to be the best choice. Fig. 15 is a photograph of 
some solar cell modules with and without the sapphire protection. 
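
The trade between lifetime and number of cells follows directly from the linear dependence of output on log time. The sketch below takes the two outputs quoted for 0.3 g/cm² protection (16 mw at 10 months and 11 mw at 100 months) and assumes that the straight line through them holds at intermediate times. It is an interpolation of the quoted figures, not a reading of Fig. 14.

```python
import math

# Outputs per cell quoted in the text for 0.3 g/cm^2 of protection.
P_10_MONTHS_MW = 16.0
P_100_MONTHS_MW = 11.0

# Slope of the assumed straight line, in milliwatts per decade of time.
SLOPE_MW_PER_DECADE = (P_100_MONTHS_MW - P_10_MONTHS_MW) / (
    math.log10(100.0) - math.log10(10.0))


def power_mw(months):
    """Output per cell on the assumed line through the two quoted points."""
    return P_10_MONTHS_MW + SLOPE_MW_PER_DECADE * math.log10(months / 10.0)


def life_gain_from_extra_cells(required_mw_per_cell, extra_fraction):
    """Factor by which time to end of life grows when extra cells are added.

    With (1 + extra_fraction) times as many cells, each cell need deliver
    only required_mw_per_cell / (1 + extra_fraction).
    """
    relaxed_mw = required_mw_per_cell / (1.0 + extra_fraction)
    decades = (relaxed_mw - required_mw_per_cell) / SLOPE_MW_PER_DECADE
    return 10.0 ** decades


extra_cells_for_tenfold_life = P_10_MONTHS_MW / P_100_MONTHS_MW - 1.0

print(f"slope: {SLOPE_MW_PER_DECADE:.1f} mw per decade")
print(f"output at 60 months: {power_mw(60.0):.1f} mw")
print(f"extra cells for tenfold life: {extra_cells_for_tenfold_life:.0%}")
print(f"life gain from 30% more cells: {life_gain_from_extra_cells(16.0, 0.30):.1f}x")
```

On this line, ten times the life costs a little under 50 per cent more cells, and 30 per cent more cells buys roughly a factor of 5 in life, in line with the figures given in the text.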

The solar cell is yet another example of a component which can give
adequate life performance only if the component is properly designed
and used conservatively. In this case, conservative use involves paying
the weight penalty of sufficient radiation protection and increasing the
number of solar cells to allow for some inevitable loss of power output
per cell in response to radiation.

Fig. 15 — Photograph of solar cells, without protection (center) and with
sapphire shields.


V. CONCLUSIONS 


Returning to Table I, it is seen that the failure rate of 20 per 10⁹
hours chosen for transistors in case I is probably a conservative figure. 
This degree of reliability has already been observed in the field on older 
devices that did not have the benefit of more recent design improve- 
ments and that were not life tested, selected and carefully handled as de- 
vices would be for satellite use. With proper selection and handling 
care, these older devices would almost certainly meet the requirements 
for case II and possibly for case III. The results of accelerated aging of 
the newer product lead to predictions of at least one order of magnitude 
improvement in transistor reliability. Assuming that at least some of 
this improvement will be realized under operating conditions, one ex- 
pects that transistor performance is adequate for case III. The relia- 
bility of diodes, which approximates that for transistors, is similarly 
adequate for case III. Should transistor and diode failure rates indeed 
turn out to be in the region of one per 10⁹ hours, then more complex
satellites could be designed with life expectancy much longer than five 
years. 

It further appears that traveling-wave tubes can be made that will
survive launch and should not limit the life in orbit. Finally, even under 
the most pessimistic assumptions as to the nature of the Van Allen 
belt, solar cell power plants can be provided, at a weight penalty, to 
meet the required life. More precise design of solar cell power supplies 
will only be possible when more precise and extensive data are availa- 
ble on the nature of the Van Allen belt. 

Adequately reliable communication satellites can therefore be made, 
provided they incorporate components of proven integrity which are 
used in a conservative design. The use of components of proven in- 
tegrity involves expense for high-quality design, careful manufacture 
and painstaking selection. The use of such components does not permit 
the performance advantages that might be gained with use of develop- 
mental components. In the final analysis, conservative design leads to 
more weight per given function. Typical examples are the increased 
weight of a rugged traveling-wave tube, the weight of solar cell protec- 
tive covers, the weight of additional solar cells to allow for the inevitable 
degradation in the Van Allen belt, and the additional weight of cir- 
cuitry designed with ample margins. 

Hence, limitations of weight in orbit and requirements of long life in 
orbit both result in a limit on the complexity of the satellite. Communi- 
cation satellites in the immediate future must be simple. As higher com- 
ponent reliability is demonstrated and as improved vehicles permit 
greater payloads, so can the complexity of the satellites increase. 


Automatic Stereoscopic Presentation of
Functions of Two Variables 


By BELA JULESZ and JOAN E. MILLER 
(Manuscript received September 21, 1961) 


Spatial models of functions of two variables are often a valuable research 
tool. Nomograms and artistic relief drawings in two dimensions are diffi- 
cult to prepare and still lack the direct impact of a spatial object. It has been 
demonstrated (see Ref. 2) that objects with a randomly dotted surface permit 
the determination of binocular parallax and, thus, can be seen in depth even 
though they are devoid of all other depth cues. This random surface presenta- 
tion has the advantage that the random brightness points can be evenly and 
densely placed, whereas the classical contour-line projection at equally 
spaced heights may leave empty spaces between adjacent contour-lines. A
digital computer is used to generate the three-dimensional image of a given
z = f(x, y) function and to wrap its surface with points of random
brightness. The stereo projections of the function are obtained and, when viewed
stereoscopically, give the impression of the three-dimensional object as being 
viewed along the z-axis. The random surface prevents the accumulation of 
clusters of uniform regions or periodic patterns which yield ambiguities 
when fused. Two stereo demonstrations are given of surfaces obtained by 
this method. 
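
The principle stated in the abstract can be sketched in a few lines. The program below is a much simplified illustration and not the authors' program: it covers the surface with random brightness points and forms the second view by shifting each point horizontally in proportion to its height, which is only a crude stand-in for a true stereo projection. The function name, image size and shift scale are arbitrary choices made here.

```python
import random


def random_dot_stereo_pair(height_fn, width=64, rows=48, max_shift=6, seed=0):
    """Return (left, right) images as lists of rows of 0/1 brightness values.

    height_fn(u, v) takes coordinates in [0, 1] and returns a height in [0, 1].
    Each point of the left image is moved leftward in the right image by a
    number of picture elements proportional to its height (its parallax).
    """
    rng = random.Random(seed)
    left, right = [], []
    for r in range(rows):
        row = [rng.randint(0, 1) for _ in range(width)]
        shifted = [None] * width
        for c in range(width):
            z = height_fn(c / (width - 1), r / (rows - 1))
            target = c - round(max_shift * z)
            if 0 <= target < width:
                # Where two points land on the same element, the later one
                # has the larger shift, so the higher surface point wins.
                shifted[target] = row[c]
        # Elements left uncovered by the shift receive fresh random brightness.
        shifted = [rng.randint(0, 1) if v is None else v for v in shifted]
        left.append(row)
        right.append(shifted)
    return left, right


def dome(u, v):
    """Example surface: a paraboloid peaking at the center of the picture."""
    return max(0.0, 1.0 - 4.0 * ((u - 0.5) ** 2 + (v - 0.5) ** 2))


left_image, right_image = random_dot_stereo_pair(dome)
for row in left_image[:4]:
    print("".join("#" if p else "." for p in row))
```

Neither image alone shows any trace of the dome; the surface is carried solely by the horizontal disparities between the two, which is the property the random surface is intended to exploit.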


I. INTRODUCTION 


Pictorial representations and visual displays are invaluable aids in 
conveying scientific or technical information. In particular, the problem 
of presenting three-dimensional data is of interest both from the stand- 
point of its wide range of applicability and the difficulty involved in the 
production of such representations. 

The methods usually employed to present functions of two variables 
in the fields of applied mathematics, engineering, cartography, etc., fall 
into two categories: 1) two-dimensional and 2) three-dimensional dis- 
plays. The first has the obvious advantage of being suitable for the 
printed page, thus permitting a wide circulation for the information so 
presented. The techniques of nomography, orthography, isarithmic 
(contour line) representations (see Fig. 1) and relief drawings (see Fig. 
2) belong to this category and are widely used despite the expense and 
difficulty in their preparation. However, the greatest objection is perhaps 
the failure of such displays to match the capabilities of human observers, 
who are equipped to perceive a three-dimensional object in depth. The 
second category — that of spatial models or sculpture — answers this 
objection, but these models are usually much too difficult to execute and 
much too limited in their applicability. 

There is, therefore, a need for a technique which a) eliminates the 
tedious effort required of draftsmen in producing such displays, b) 
presents displays complete with the spatial effects inherently belonging 
to three-dimensional objects and appreciated by human observers, and 
c) generates displays suitable for the printed page. This first requirement 
has already been met for two-dimensional representations by the de- 





Fig. 1. Isarithmic (contour-line) drawing (Example 1). 

Fig. 2. Relief drawing (Example 1). 


velopment of oscilloscopic displays which automatically project onto 
the screen of the oscilloscope the object surface defined by one dependent 
voltage and two independent voltages.¹ The second and third require- 
ments, however, seem of particular interest, and therefore this paper 
discusses a method employing a computer to make stereoscopic presenta- 
tions of functions of two variables. 


II. METHOD 


The technique to be described here may be outlined as follows: the 
three-dimensional image of a given function z = f(x,y), which is sup- 
plied as a table of corresponding x, y and z values, is stored in a digital 
computer. The computer is programmed to generate a stereo picture 
pair which, when fused, gives the subjective impression of the three- 
dimensional object as being viewed along the z-axis perpendicular to the 
base plane of x and y. This procedure for obtaining the stereo projections 
of an object can be considered in three parts: 1) defining the function to 
be presented as a three-dimensional object, 2) "wrapping" the object 
with a textured surface, and 3) generating a stereo pair by taking proper 
projections of the object. 

In practice, the variables x, y and z must be evenly sampled with a 
given resolution. Therefore, the object can be defined only by approxi- 
mation, and the various approximations differ in their fine structure. 
The classical method is the contour-line approach shown in Fig. 3(a). 
Here the z values are quantized into equal levels, and to any such z 
level the corresponding x and y values are taken (rounded to their 
nearest sample). This approximation yields uniformly distributed z 
values but uneven coverage of the x and y values. If the surface rises 
sharply toward the observer, the (x,y) points become densely packed, 
whereas if the surface becomes flat, these points become farther apart, 
resulting in gaps. Another possible approximation is shown in Fig. 3(b). 
Here the evenly sampled x and y values are taken and the corresponding 
z values are determined and rounded to their nearest sample. Hence, the 
surface to be displayed is defined by a dense covering of points obtained 
by projection up from the base plane. This second method of approxima- 
tion is chosen since the object is to be viewed from above, and since the 
dense covering will result in efficient use of the available stereo picture 
area. In the case of multiple-valued functions or several functions con- 
sidered in one display, the projection is made onto the maximum z value, 
which is the point closest to the observer. 

Fig. 3. (a) Surface definition with even z-axis quantization (contour lines); 
(b) surface definition with even x, y plane quantization. 
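
The second approximation (even sampling of the base plane, rounding of z, and selection of the maximum z where several surfaces overlap) can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' program: the name `sample_surface`, the use of NumPy, the `z_step` parameter, and the convention that a function returns NaN where it is undefined are all introduced here.

```python
import numpy as np


def sample_surface(funcs, n=100, z_step=1.0):
    """Sample one or more functions z = f(x, y) on an even n-by-n grid.

    x and y run from 1 to n. Each z value is rounded to the nearest
    multiple of z_step, and where several functions are defined at the
    same (x, y) the largest z (the point closest to the observer) is kept.
    Points where no function is defined are left at -inf.
    """
    x, y = np.meshgrid(np.arange(1, n + 1), np.arange(1, n + 1), indexing="xy")
    z = np.full(x.shape, -np.inf)
    for f in funcs:
        zf = np.asarray(f(x, y), dtype=float)
        # Assumed convention: NaN means "this function is undefined here".
        zf = np.where(np.isnan(zf), -np.inf, zf)
        z = np.maximum(z, zf)
    return x, y, np.round(z / z_step) * z_step
```

Because every (x, y) sample receives exactly one z value, the base plane is covered without gaps, which is the property the text gives as the reason for preferring this approximation over contour lines.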

The surface of the object is thus defined but in a rather abstract sense. 
In order that the object be visible, brightness values must be assigned 
to every surface point. The use of identical brightness values for all 
points would yield a surface of homogeneous texture when viewed per- 
pendicular to the base plane. Such a surface would have no patterns, 
shadows, or brightness changes due to different angles of reflection; that 
is, it would have neither monocular nor binocular depth cues and thus 
would be inappropriate for the purpose. Therefore, to obtain depth cues 
each point must be printed at varying brightness levels. It has been 
shown² that stereo picture pairs composed of points of random bright- 
ness and thus devoid of all cues except binocular parallax can be per- 
ceived in depth when fused. Therefore, it is sufficient to assign randomly 
to each point (x,y,z) a brightness level. The brightness selection on a 
random basis is a simple procedure, eliminating any consideration for 
appropriate monocular cues, and has the further advantage of avoiding 
periodicities and regions of ambiguities. That is, a point domain seen by 
the left eye may be fused with any periodically repeating domain seen by 
the right, if such exists, thus producing confusion as to the correct binoc- 
ular parallax. Therefore, random brightness patterns are used to produce 
unique point domains which can be fused unambiguously. The question 
of how many brightness levels to use in the random selection is answered 
by the requirements of the system of output to be used. However, the 
use of few levels increases the probability of occurrence of any one level, 
and clusters of points of equal brightness can produce areas of indetermi- 
nate depth on the surface to be viewed. For a photographic output proce- 
dure requiring a small number of levels, it would be desirable, therefore, 
to apply rules to the random selection which would prevent these 
clusters. Also, a useful monocular cue could be provided by regulating 
the occurrence of certain brightness levels in a manner dependent upon 
the z-level of the point domain. Thus, there are many possible refine- 
ments to the basic procedure of giving texture to the surface by ran- 
domly assigning brightness levels. 
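
The remark that few brightness levels make clusters of equal brightness more likely can be illustrated numerically. The sketch below is an assumption-laden illustration: the function names, the uniform random choice of levels, and the use of horizontally adjacent pairs as a simple measure of clustering are all choices made here, not details taken from the paper.

```python
import numpy as np


def random_brightness(shape, n_levels=64, seed=None):
    """Assign each surface point one of n_levels brightness levels at random."""
    rng = np.random.default_rng(seed)
    return rng.integers(0, n_levels, size=shape)


def equal_neighbor_fraction(levels):
    """Fraction of horizontally adjacent point pairs with equal brightness."""
    return float(np.mean(levels[:, 1:] == levels[:, :-1]))
```

With independent, uniformly chosen levels, two neighboring points match with probability 1/n_levels, so about half of all neighboring pairs match for two levels but fewer than two percent for sixty-four.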

The object has been defined and its surface has been invested with 
brightness levels, albeit random and uninformative when viewed mon- 
ocularly. It remains now to produce a stereo pair in which the point 
domains are given the proper parallax shift. The calculations for two 
such pictures follow the simple formulas for projection, which are shown 
in Fig. 4. The center of projection is considered at a distance H from the 
base plane of the object. The centers of projection for the stereo picture 
pair are separated by a base distance B and are positioned symmetrically 
about the z-axis. The plane of the pair is at a distance F from the centers 
of projection. The projections for each point (x,y,z) of the surface where 
z = f(x,y) onto the left and right members of the pair are then given by 
the relations 

The total parallax for the point (x,y,z) is seen to be Δ = BF/(H − z) 
and is shared equally by the two pictures. 
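
A central projection consistent with this description can be sketched as follows. The exact form is an assumption: the two centers of projection are taken at x = −B/2 and x = +B/2 at height H, and each picture coordinate is measured from the foot of the perpendicular dropped from its own center onto the picture plane. The function name `project_stereo` is hypothetical. Under these assumptions the difference between the left and right x-coordinates is BF/(H − z), in agreement with the total parallax stated above.

```python
def project_stereo(x, y, z, B, F, H):
    """Project the point (x, y, z) onto the left and right pictures.

    Assumed geometry: centers of projection at (-B/2, 0, H) and
    (+B/2, 0, H), picture plane at distance F from the centers.
    Returns ((x_left, y_left), (x_right, y_right)).
    """
    scale = F / (H - z)            # magnification for a point at height z
    x_left = (x + B / 2.0) * scale
    x_right = (x - B / 2.0) * scale
    y_both = y * scale             # no vertical parallax
    return (x_left, y_both), (x_right, y_both)
```

For example, with B = 6, F = 50, H = 110 and a point at height z = 10, the scale factor is 0.5 and the two x-coordinates differ by 3, which equals BF/(H − z) = 300/100.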

It should be pointed out that binocular parallax alone provides 
only the perception of relative depth. Without other depth cues it is 
not possible to determine absolute depth when fusing the pair obtained 
by the above projections. That is to say, the perceived z-scaling, which 
is some monotonic function of Δ = const./(H − z), is obtained by some 
arbitrary selection for the value H. (In stereoscopic viewing the sup- 
plementary depth cues determine the absolute distance of the plane of 
the stereo pictures from the observer, which is subjectively substituted 
for H.) If the function Δ = const./(H − z) is used, the parallax shifts 
will be similar to those experienced by the human optical system and 
thus will give rise to familiar percepts of z-scaling. Inasmuch as the 
perceived depth is some monotonic function of the binocular parallax, 
which is in turn a monotonic function of the height of the surface, it 
suffices to choose any monotonic function Δ = f(z). For example, if the 
range of z is limited, the function Δ = 2z gives a good approximation to 
the projection of Fig. 4. This function merely provides a different sub- 
jective z-scaling. If a numerical z-scale is provided, which can be per- 
ceived in depth together with the surface to be presented, and if both 
are generated according to the same projection rules, then the problem 
of a correctly labelled stereoscopic projection is solved. 

Fig. 4. Projection of an object onto a stereo pair. 

Consequently, the parallax shift can be computed, having selected a 
function Δ = f(z), which gives a new position to each point, and an 
identical brightness level can be assigned at random to the corresponding 
points in the left and right fields. The stereo pair then results by use of a 
suitable output medium, that is, a video transducer in which the x,y 
positions correspond to the deflection, and in which the brightness values 
correspond to the intensity of the beam in a cathode-ray tube display. 
Inasmuch as a digital system is used, the video transducer generates a 
sampled display. Since the projection can produce expansions and con- 
tractions of point domains on the surface or parallax shifts which are not 
integral multiples of the sampling intervals, over-sampling is required. 
This, however, results in great strain on the storage capacity of available 
computers and on the resolution requirements of video transducers. 
Therefore, it is necessary to make compromises by trading resolution in 
object definition for resolution in depth. In the present state of tech- 
nology, however, there are devices available which will satisfy this 
requirement. 


III. INSTRUMENTATION 


The above steps were carried out by quantizing the base plane into 
10,000 points with the scale on the x and y axes running from 1 to 100. 
An IBM 7090 computer was used to generate an array of corresponding 
values z = f(x,y) and to assign a random number designating the bright- 
ness for each of the points. For simplicity, the function Δ = z was chosen 
for the parallax shift instead of the geometric projection and was ap- 
plied in the x-direction only. That is, the coordinates of the point (x,y) 
in the left and right pictures were 

$$x_L = x + z/2, \qquad x_R = x - z/2$$

and 

$$y_L = y, \qquad y_R = y.$$


This corresponds to projecting the stereo pair with a cylindrical lens, 
the axis of which runs parallel to the y-direction. This position and 
brightness information was then written on digital magnetic tape and 
put into a General Dynamics S-C 4020 microfilm printer, which served 
as the output device for the stereo pair. Different brightness levels were 
achieved by randomly employing each of the sixty-four type characters 
available on the microfilm printer. The variation in density of each of 
the characters gave sufficient variation in brightness level and provided 
an efficient means for plotting brightness information. The grid size of 
the microfilm output was 1024 x 1024 and provided, therefore, an 
oversampling of ten to one for the chosen picture size. This oversampling 
gave enough stereo resolution for most applications. In order that the 
type characters did not overlap and totally obscure each other, the 
maximum shift permitted between two points was taken to be six micro- 
positions. This limited the total parallax shift to twelve units. That is, 
the maximum angle of rise between two points on the surface which could 
be displayed was approximately 85°. The total range of the surface was 
further restricted by the locations of the peaks and valleys relative to 
the side boundaries of the grid. For the examples to be shown, a scaling 
on the z-values of about 60 levels running from —30 to +30 was chosen. 
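
The plotting step can be sketched as below. Several details are assumptions made for illustration only: each base-plane sample is placed at ten times its coordinate on a 1024 × 1024 raster, z is measured directly in micro-positions, each picture receives half of the shift Δ = z, and a later point simply overwrites an earlier one at the same micro-position. The function name `stereo_pair` and the use of −1 for an empty position are hypothetical.

```python
import numpy as np


def stereo_pair(z, brightness, oversample=10, grid=1024):
    """Build left and right rasters from heights z and brightness codes.

    z and brightness are arrays of equal shape (rows = y, columns = x).
    Each point is shifted by +z/2 in the left picture and -z/2 in the
    right picture, in the x-direction only. Empty positions hold -1.
    """
    n_rows, n_cols = z.shape
    left = np.full((grid, grid), -1, dtype=int)
    right = np.full((grid, grid), -1, dtype=int)
    for r in range(n_rows):
        for c in range(n_cols):
            x_micro = (c + 1) * oversample   # x = c + 1 on the 1..100 scale
            y_micro = (r + 1) * oversample   # y is not shifted
            shift = float(z[r, c]) / 2.0
            xl = int(round(x_micro + shift))
            xr = int(round(x_micro - shift))
            if 0 <= xl < grid:
                left[y_micro, xl] = brightness[r, c]
            if 0 <= xr < grid:
                right[y_micro, xr] = brightness[r, c]
    return left, right
```

A single raised point with z = 12 on an otherwise flat surface is moved six micro-positions to the right in the left picture and six to the left in the right picture, giving a total parallax of twelve units.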

An alternative semi-automatic method of output using an optical 
system was also investigated. This technique resulted in the production 
of a solid model of the surface to be displayed, which was then photo- 
graphed by a stereo camera to obtain the desired picture pair. The model 
was prepared in layers by printing the points belonging to each of the 
quantized z-levels on transparent glass slides as black and white dots. 
The slides were then stacked together in register to form a solid cube, 
where the width of the glass plates determined the scale factor for the 
z-axis. The prints for each level were obtained by writing the picture 
information on magnetic tape in digital form as computer output and 
by using a digital-to-analog converter and a slow-speed television moni- 
tor to produce oscilloscope displays, which were then photographed.*:+* 
A secure mounting of the stack of glass slides in which the entire stack 
remained transparent was achieved by making an air-tight seal between 
each plate with a polyester resin having an index of refraction suffi- 
ciently near that of the glass.* This technique results, therefore, in a 
stereo pair in which the parallax shift corresponds to the geometric 
projection, and furthermore, gives rise to a solid model which is a de- 
sirable by-product. 


IV. RESULTS 


Example 1, generated and displayed automatically, is shown in Fig. 5, 
and can be perceived in depth when viewed stereoscopically. Viewing 
may be facilitated by use of Fresnel lenses accompanying the article 
cited in Ref. 2. The following three surfaces are presented: 


1) the hyperbolic paraboloid, 

$$\left(\frac{x-50}{30}\right)^{2} - \left(\frac{y-50}{20}\right)^{2} = \frac{z}{80},$$

with saddle point at (50, 50, 0), 

2) the elliptical paraboloid, 

$$\left(\frac{x-50}{10}\right)^{2} + \left(\frac{y-79}{20}\right)^{2} = \frac{-(z-20)}{50},$$

with vertex at (50, 79, 20), and 

3) the torus, 

$$\left(\sqrt{(x-50)^{2} + (y-50)^{2}} - 42\right)^{2} + z^{2} = 6^{2},$$

centered at (50, 50, 0) and having a radius of 6. (Conventional two- 
dimensional displays of Example 1 were given in Figs. 1 and 2.) 

Fig. 5. Automatic stereoscopic presentation of a function (Example 1). 

* The slide mounting techniques were developed by R. A. Payne of Bell Tele- 
phone Laboratories. 
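
The three surfaces of Example 1 can be evaluated on the base plane as sketched below. The equations are solved for z in the form read here from the text (difference of squares for the hyperbolic paraboloid, sum of squares for the elliptical paraboloid, a tube of radius 6 about a circle of radius 42 for the torus). How the three were combined and scaled for the published figure is not stated in detail, so taking the maximum z at each point and clipping to the range −30 to +30 are assumptions, as are all function names.

```python
import numpy as np


def hyperbolic_paraboloid(x, y):
    """z from ((x-50)/30)^2 - ((y-50)/20)^2 = z/80."""
    return 80.0 * (((x - 50.0) / 30.0) ** 2 - ((y - 50.0) / 20.0) ** 2)


def elliptical_paraboloid(x, y):
    """z from ((x-50)/10)^2 + ((y-79)/20)^2 = -(z-20)/50."""
    return 20.0 - 50.0 * (((x - 50.0) / 10.0) ** 2 + ((y - 79.0) / 20.0) ** 2)


def torus_upper(x, y):
    """Upper half of the torus; -inf where the torus is not defined."""
    r = np.hypot(x - 50.0, y - 50.0)
    inside = 36.0 - (r - 42.0) ** 2
    return np.where(inside >= 0.0, np.sqrt(np.maximum(inside, 0.0)), -np.inf)


def example_1(x, y, z_min=-30.0, z_max=30.0):
    """Maximum of the three surfaces, rounded and clipped (assumed scaling)."""
    z = np.maximum.reduce([
        hyperbolic_paraboloid(x, y),
        elliptical_paraboloid(x, y),
        torus_upper(x, y),
    ])
    return np.clip(np.round(z), z_min, z_max)
```

Evaluating `example_1` on a 100 × 100 grid of x and y values from 1 to 100 gives one height per base-plane point, ready for random brightness assignment and parallax shifting.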

The reduction required for reproducing the stereo pictures here has 
made the resolution of individual type characters very difficult. For 
this reason, a presentation in depth of the numerical z-scale was omitted. 
However, a very effective display can be achieved, including the z-scale, 
by using a larger picture size. 

In Fig. 6 the same surface is displayed by the optical method. Two 
views are presented to show that variable scaling on the z-axis is possible. 
The amount of depth is determined by both the width of the glass 
plates and the base distance between the two lenses of the stereo camera 
(or distance between two positions of a single-lens camera). Fig. 6(b) 
was produced with greater base distance, and consequently the surface 
has greater stretching in the z-direction. The numerals on the right edge 
of the displays indicating the z-levels were applied to the appropriate 
slides by hand and were not generated as picture material. It should also 
be pointed out that in this method, the points of the two pictures are 
more clustered and less uniform in distribution. This demonstrates the 
expanding and contracting of point domains produced by the trans- 
formation of projection and provides a helpful monocular cue. All dis- 
plays are far more evenly filled with brightness elements, however, than 
if the contour-line method had been used. 

Fig. 6. (a) Semi-automatic stereoscopic presentation of a function (Example 
1); (b) same as (a), but with increased z-axis scaling. 

Example 2, shown in Fig. 7, is that of a spiral given by the parametric 
equations 

$$x = \rho\cos\theta + 50.5$$
$$y = \rho\sin\theta + 50.5$$
pet G35 
T 
with 
05060 67 
15 

= 50 2 6: 

. a 2a 


This presentation is another display from the microfilm printer, illustrat- 
ing the procedure in its completely automatic form. Approximately one 
minute of time is required to generate and display the stereo informa- 
tion. 


V. SUMMARY 


A method for automatically presenting three-dimensional information 
in depth has been described. The advantages are threefold in that (i) 
such presentations make possible displays which are very difficult if not 
impossible to obtain by other means, (ii) they carry the spatial impact 
enjoyed by human observers, and (iii) they are suitable for the printed 
page. The technique has been outlined in three steps: definition of sur- 
face, texturing of the surface with brightness elements, and generation 
of stereo projections of the surface. By use of digital computers and the 
special-purpose output devices now available, this procedure can be 
carried out in a completely automatic fashion, thus making possible a 
simple and effective demonstration of three-dimensional data. 

Fig. 7. Automatic stereoscopic presentation of a function (Example 2). 


Maximization of the Fundamental 
Power in Nonlinear Capacitance 
Diodes 


By J. A. MORRISON 
(Manuscript received October 6, 1961) 


In this paper we consider the problem of determining the maximum 
fundamental power in a nonlinear capacitance diode, when the charge 
waveform has a given periodicity and (i) varies between prescribed maxi- 
mum and minimum values, or (ii) has a prescribed maximum and a pre- 
scribed maximum slope. Under (i) the maximum obtainable fundamental 
power is first determined. The charge waveform is then further restricted to 
contain no higher than second harmonics, so that the diode is being used as 
a frequency doubler, and the maximum power transfer is determined. The 
maximum power transfer is also determined under (ii). Particular diodes 
considered are the abrupt-junction and the graded-junction ones, with oper- 
ation in the forward conduction region being permitted. 


I. ENGINEER’S SUMMARY 


This section of the paper is a summary which stresses some of the 
contents of the introduction and summary that follow. It is hoped that 
this will make it easier for the engineer who is involved in parametric 
amplifier and varactor design to deduce the relevant applications of the 
results contained in this paper. 

In the first instance it should be emphasized that an idealized problem, 
based on a mathematical model, is considered. The nonlinear capacitor 
is assumed to be isolated from any external circuits, and we do not dis- 
cuss how the power is fed into or taken from the device. Clearly there 
will be some power lost in the external circuit, and the maximum ob- 
tainable fundamental power determined in this paper is only a theoretical 
maximum, but it would seem to be worthwhile to understand this 
theoretical maximum. When the maximum power transfer from the 
first to the second harmonic is considered, the charge waveform, and 
hence the current, giving this maximum is determined. Clearly there is 
some relative phase between the first and second harmonics in the cur- 
rent, and the reactance of the output circuit must be adjusted so as to 
obtain this relative phase. 

It is also important to stress that some of the results obtained hold 
for a general, i.e., arbitrary single-valued, voltage-charge relationship, 
and are accordingly applicable to any particular such voltage-charge 
relationship in which the engineer may be interested. We have, for 
simplicity, considered just the abrupt-junction and the graded-junction 
diodes as special cases, and have idealized the voltage-charge relation- 
ship in the forward conduction region, but other particular diodes can 
be considered as special cases of the general results. We discuss below 
the results which are pertinent to the general voltage-charge relationship. 

Firstly, we have derived the functional form of the charge waveform 
(of given periodicity and varying between prescribed values) which gives 
the maximum power in the fundamental. The charge waveform is 
composed (see (33) below) of intervals in which it takes on either the 
maximum or minimum prescribed value, or else follows a certain curve. 
The form of the curve depends on the voltage-charge relationship and 
involves parameters which are functionals of the charge waveform 
throughout the entire period, and hence are not known a priori. These 
parameters have to be determined for each particular voltage-charge 
relationship, by solving simultaneous transcendental equations. It is also 
necessary to allow for finite jumps in the charge waveform, and (86) 
below must hold at such a jump. Of course, a jump is not physically 
realizable, since it would correspond to an infinite current, and this 
makes it evident that the maximum is a theoretical one, quite apart 
from losses in the external circuit. It does, however, provide an upper 
bound on the maximum realizable fundamental power. 

In view of the fact that the maximum fundamental power has to be 
determined separately for each specific diode, we derive upper and lower 
bounds for the maximum fundamental power, (11) to (18), which apply 
to a general voltage-charge relationship. For a wide class, the ratio of 
the upper to the lower bound is 1.54. It turns out that, for the particular 
diodes considered, the lower bound is quite close to the actual value. 
Further use is made of the charge waveform giving this lower bound, 
when the power transfer from the fundamental to the second harmonic 
is considered, subject to the charge waveform containing no higher than 
second harmonics. A good approximation to the maximum power 
transfer is obtained by taking the Fourier approximation, up to second 
harmonics, and suitably normalizing so that the approximating charge 
waveform has the prescribed maximum and minimum values. 
In connection with maximizing the power transfer from the funda- 
mental to the second harmonic, we consider the diode to be a harmonic 
generator, there being input power in the fundamental only. In order to 
make the mathematical problem more tractable, it is supposed that the 
entire output is in the second harmonic. Equations (18) and (19) simply 
state that the maximum power output in the second harmonic, when 
there is input power in the fundamental only, is not greater than the 
maximum obtainable fundamental power without such restrictions, 
and is not less than the maximum fundamental power when there is no 
output or input power in the third and higher harmonics. It is assumed 
here that the charge waveform is continuous. We have already discussed 
the maximum obtainable fundamental power. 

The problem of determining the maximum fundamental power when 
there is no output or input power in the third and higher harmonics is 
still not very tractable, without additional restrictions on the charge 
waveform, and it is thus further supposed that the charge waveform 
contains no higher than second harmonics. The maximum subject to 
this additional restriction is obviously not greater than the maximum 
without it. The significant point about this restriction is that there is 
then no power output or input in the third and higher harmonics, what- 
ever the voltage-charge relationship. We thus determine a canonical 
representation of the charge waveform which contains no higher than 
second harmonics and has prescribed maximum and minimum values. 
By suitable choice of the time origin, this representation contains just 
two parameters which lie in a bounded region. 

Now, it is a straightforward matter to compute numerically the funda- 
mental power for any given voltage-charge relationship and a given 
charge waveform. The numerical maximization of this power with re- 
spect to the two parameters in the above canonical representation is 
also a straightforward process. Thus it is clear that the above procedure 
has general applicability. We add that in the numerical maximization 
process, the two parameters which give the approximating charge wave- 
form (obtained from the charge waveform giving the good lower bound 
to the maximum obtainable fundamental power) are used for starting 
values. 

Consideration is also given to the current-limited diode, in which the 
charge waveform has a prescribed maximum value and a prescribed 
maximum slope (corresponding to maximum current magnitude). 
Again, we determine a two-parameter canonical representation for the 
charge waveform containing no higher than second harmonics, and the 
numerical maximization of the fundamental power, for any given volt- 
age-charge relationship, proceeds along the same lines as in the previous 
case, except that we no longer have predetermined starting values for 
the two parameters. Lack of space has prevented inclusion of the de- 
termination of the functional form of the charge waveform which gives 
the maximum obtainable fundamental power (without restriction on 
the harmonic content of the charge waveform) in the current-limited 
case. 


II. INTRODUCTION AND SUMMARY 


2.1 Introduction 


We will be concerned with various nonlinear capacitance diodes, these 
being characterized by a nonlinear voltage-charge relationship. Specific 
examples are the abrupt-junction diode and the graded-junction diode, 
which are composed of diffused p-n junctions. In the former case the 
voltage difference, $v$, across the diode is proportional to the square of 
the stored charge (per unit area), $q$, i.e., $v \propto q^{2}$, while in the latter case 
$v \propto q^{3/2}$, provided, in both cases, that $q \ge 0$, which implies that operation 
of the diode does not take place in the forward conduction region. Now 
as electric field strength and barrier width increase, creation of electron- 
hole pairs through secondary impact ionization by both holes and elec- 
trons leads to avalanche multiplication, resulting finally in an effectively 
infinite increase of current with added applied voltage, and this is termed 
reverse breakdown. There is thus a maximum voltage $v_{\max}$, and a cor- 
responding maximum value $q_{\max}$ of the charge density (which may be 
related to $v_{\max}$ through the actual voltage-charge relationship), above 
which it is not desirable to operate the diode. 

We define the normalized voltage V and the normalized charge Q by 

$$V = \frac{v}{v_{\max}}; \qquad Q = \frac{q}{q_{\max}}. \tag{1}$$








Hence the normalized voltage-charge relationships for the abrupt-junc- 
tion and graded-junction diodes, operated in the region between forward 
conduction and reverse breakdown, are 

$$V = \begin{cases} Q^{2}, & \text{(abrupt)} \\ Q^{3/2}, & \text{(graded)} \end{cases} \qquad 0 \le Q \le 1. \tag{2}$$

It is also possible to operate the diodes partially in the forward conduc- 
tion region, corresponding to Q < 0. The voltage is not very dependent 
on the charge in this region and as an idealization we may assume that 
it is zero throughout. A physical restriction is placed on the maximum 
possible current magnitude, in that the electron velocity is limited by 
lattice scattering. Throughout most of our analysis we replace this 
condition by a limitation on the minimum charge, so that 

$$q \ge q_{\min} = -m\,q_{\max}. \tag{3}$$

Thus, in the forward conduction region, 

$$V = 0, \qquad -m \le Q \le 0. \tag{4}$$


We do, however, give some consideration to the current-limited diode 
in which, instead of (3), 


$$|i| \le i_{\max}. \tag{5}$$


We will consider charge waveforms that are periodic in time, t, with 
angular frequency ω. We define the normalized time x and the normalized 
current I by 

$$x = \omega t; \qquad I = \frac{i}{\omega q_{\max}}. \tag{6}$$

Thus Q(x) is periodic in x with period 2π and, since i = dq/dt, 

$$I = \frac{dQ}{dx} = Q'(x). \tag{7}$$


The average real and reactive powers (per unit area) in the nth har- 
monic, $p_n$ and $r_n$, are given by 

$$(p_n + j r_n) = \frac{1}{2}\left(\frac{\omega}{\pi}\right)^{2}\left(\int_0^{2\pi/\omega} i\,e^{-j\omega n t}\,dt\right)\left(\int_0^{2\pi/\omega} v\,e^{j\omega n t}\,dt\right). \tag{8}$$

We define the normalized real and reactive powers in the nth harmonic, 
$P_n$ and $R_n$, by 

$$(P_n + jR_n) = \frac{2\pi^{2}(p_n + j r_n)}{\omega q_{\max} v_{\max}}. \tag{9}$$

We will be concerned with the maximization of the real fundamental 
power, under various conditions, and summarize the results below. We 
note that $P_1$ is not affected by a time shift in the charge waveform, but 
it is reversed in sign by a time reversal of the waveform. 


2.2 The Maximum Obtainable Fundamental Power, When the Charge 
Waveform is Subject to Bounded Variation 


The functional form of the charge waveform which, subject to the 
restriction $-m \le Q(x) \le 1$, maximizes the fundamental power, $P_1$, is 
found for the general voltage-charge relationship, $V = V(Q)$. The 
specific form is determined for diodes of interest and the corresponding 
value of max $P_1$, the maximum obtainable fundamental power, calcu- 
lated. Thus, for the abrupt-junction diode operated in the region between 
forward conduction and reverse breakdown, (2), max $P_1$ = 0.687 and 
the charge waveform Q(x) giving rise to this value is depicted in Fig. 1. 
The corresponding value of the reactive fundamental power is $R_1$ = 
2.43. For the graded-junction diode, operated in the region between 
forward conduction and reverse breakdown, it is found that max $P_1$ = 
0.408, with $R_1$ = 2.48. The charge waveform giving rise to these values 
is depicted in Fig. 2. The abrupt-junction diode is also considered when 
the region of operation includes forward conduction. Thus, from (2) 
and (4), $V(Q) = [\max(0,Q)]^{2}$, $-m \le Q(x) \le 1$. Fig. 4 depicts max $P_1$ 
and the corresponding $R_1$ as functions of m. The charge waveform Q(x) 
which gives these values when m = 1 is shown in Fig. 5. The somewhat 
idealized voltage-charge relationship given by $V(Q) = \max(0,Q)$, 
$-m \le Q(x) \le 1$, m > 0, may be treated analytically. It is found in 
this case that 

$$\max P_1 = \frac{3\sqrt{3}}{2}\,m, \qquad R_1 = \frac{3}{2}(m + 2). \tag{10}$$

The charge waveform giving these values is composed of Q(x) = 1, 0 
and −m in consecutive intervals of x of length 2π/3. 
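
The three-level result can be checked numerically. The sketch below assumes that, in normalized variables, the definitions of the harmonic powers reduce to the product of two integrals over one period, $P_n + jR_n = \left(\int_0^{2\pi} Q'(x)e^{-jnx}dx\right)\left(\int_0^{2\pi} V(Q(x))e^{jnx}dx\right)$, and it rewrites the first integral as $jn\int_0^{2\pi} Q(x)e^{-jnx}dx$ (integration by parts) so that waveforms with jumps can be handled. This reading of the normalization, the function names, and the midpoint-rule integration are assumptions made for illustration.

```python
import numpy as np


def harmonic_power(Q, V, n=1, samples=6000):
    """Return P_n + j*R_n for a charge waveform Q(x) and a relation V(Q).

    Assumed normalization:
        P_n + j R_n = (jn * int Q e^{-jnx} dx) * (int V(Q) e^{+jnx} dx),
    with both integrals taken over one period by the midpoint rule.
    """
    h = 2.0 * np.pi / samples
    x = (np.arange(samples) + 0.5) * h       # midpoints of the cells
    q = Q(x)
    v = V(q)
    int_q = np.sum(q * np.exp(-1j * n * x)) * h
    int_v = np.sum(v * np.exp(1j * n * x)) * h
    return 1j * n * int_q * int_v


def three_level(m):
    """Waveform equal to 1, 0 and -m on consecutive thirds of the period."""
    def Q(x):
        x = np.mod(x, 2.0 * np.pi)
        return np.where(x < 2.0 * np.pi / 3.0, 1.0,
                        np.where(x < 4.0 * np.pi / 3.0, 0.0, -m))
    return Q
```

Under this assumed normalization, `harmonic_power(three_level(1.0), lambda q: np.maximum(0.0, q))` has a real part close to 3√3/2 and an imaginary part close to 9/2, and reversing the order of the three levels in time changes the sign of the real part.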

It is observed that the charge waveform which gives rise to max $P_1$, 
for the various diodes, contains at least one discontinuity (or jump) in 
a period. A jump, of course, is not physically realizable, since it would 
correspond to an infinite current, so max $P_1$ cannot actually be attained. 

Finally, upper and lower bounds are obtained on the maximum ob- 
tainable fundamental power, max $P_1$, for the general voltage-charge 
relationship $V = V(Q)$, with $-m \le Q(x) \le 1$. Thus, it is shown that 

$$\frac{3\sqrt{3}}{2}\,L \le \max P_1 \le 4(1 + m)U, \tag{11}$$

where 

$$L = \max_{-m \le (\rho,\sigma,\tau) \le 1} \left|(\sigma - \tau)V(\rho) + (\tau - \rho)V(\sigma) + (\rho - \sigma)V(\tau)\right|, \tag{12}$$

and 

$$U = \min_{\lambda}\left\{\max_{-m \le \sigma \le 1}[\lambda\sigma - V(\sigma)] - \min_{-m \le \sigma \le 1}[\lambda\sigma - V(\sigma)]\right\}. \tag{13}$$

Moreover, it is shown that 

$$L \le (1 + m)U \le 2L. \tag{14}$$

The bounds in (14) cannot be improved without restriction on V(Q), 
but if $[(\rho + m)V(1) - (m + 1)V(\rho) + (1 - \rho)V(-m)]$ does not 
change sign in $-m \le \rho \le 1$, then $L = (1 + m)U$ and the ratio of the 
upper to the lower bound in (11) becomes 1.54. The class of voltage- 
charge relationships 

$$V(Q) = [\max(0,Q)]^{\nu}, \qquad -m \le Q \le 1; \quad m \ge 0, \quad \nu \ge 1, \tag{15}$$

which includes the particular diodes considered, satisfies the above 
condition, and in this case 

$$L = m + \left(1 - \frac{1}{\nu}\right)\left[\nu(1 + m)\right]^{-1/(\nu - 1)}. \tag{16}$$

For the particular cases considered, the lower bound in (11) is fairly 
close to max $P_1$. 

A lower bound is also obtained, for a general voltage-charge aes 
ship V = V(Q), with —m S Q(x) S 1, for ous such that P; + P, = 
It is shown that 


max [P;| P; + P2 = 0] = (1.87)L. (17) 


2.3 The Maximization of the Power Transfer in a Frequency Doubler, 
With Bounded Charge Waveform 


Here we are interested in maximizing the power transfer from the
fundamental to the second harmonic, when the diode is being used as
a harmonic generator. Thus there must be input power at the fundamental
frequency only, i.e., $P_1 > 0$ and $P_n \le 0$, $n \ge 2$. In order to make
the problem more tractable we suppose that the entire power output is
in the second harmonic, so that $P_n = 0$, $n \ge 3$. It follows that
$P_1 + P_2 = 0$, provided that the charge waveform is continuous, since
then $\sum_{n=1}^{\infty} P_n = 0$. We observe that

$$\max [-P_2 \mid P_n \le 0,\ n \ge 3] \le \max [P_1 \mid P_n \le 0,\ n \ge 3] \le \max P_1, \tag{18}$$

and

$$\max [-P_2 \mid P_n \le 0,\ n \ge 3] \ge \max [-P_2 \mid P_n = 0,\ n \ge 3] = \max [P_1 \mid P_n = 0,\ n \ge 3]. \tag{19}$$

Even the problem of determining $\max [P_1 \mid P_n = 0,\ n \ge 3]$, that is,
$\max P_1$ subject to $P_n = 0$, $n \ge 3$, is not very tractable without
additional restrictions on the charge waveform. Thus, it is supposed that
the charge, and hence the current, contains no higher than second
harmonics. The conditions $P_1 + P_2 = 0$ and $P_n = 0$, $n \ge 3$, are then
identically satisfied, independently of the voltage-charge relationship
$V = V(Q)$.

Now, a change in the time origin does not affect the power transfer.
Hence, the canonical representation of a charge waveform which contains
no higher than second harmonics and is such that $Q(\pi) = Q_{\min} = -m$
and $Q(2\tan^{-1} s) = Q_{\max} = 1$, is constructed. In addition to the
parameter $s$ there is the parameter $y$ which is subject to the restriction
$0 \le y \le (1 - s^2)$, which of course also implies that $s^2 \le 1$. It is found
that $P_1 = 0$ on $y = 0$ and on $y = (1 - s^2)$, independently of the
voltage-charge relationship. Moreover, $P_1(s,y) = -P_1(-s,y)$ and in
particular $P_1 = 0$ on $s = 0$ also, so that it is sufficient to consider only
the region $-1 \le s \le 0$, $0 \le y \le (1 - s^2)$ and to maximize $|P_1|$. The
abrupt-junction diode, operated in the region between forward conduction
and reverse breakdown, may be treated analytically, and it is found
that the maximum power transfer is 0.281, as compared with the maximum
obtainable fundamental power of 0.687. The corresponding reactive
fundamental powers are 1.46 and 2.43, and the charge waveform
giving the maximum power transfer is depicted in Fig. 6, which should
be compared with Fig. 1.

In order to determine the maximum power transfer for a general
voltage-charge relationship, recourse must be made to numerical
computation. However, a prior step is the determination of a charge
waveform which provides a reasonable approximation to the maximum power
transfer, and hence provides starting values for $s$ and $y$ in the numerical
maximization process. A good lower bound was obtained for the maximum
obtainable fundamental power. Furthermore, for a wide class of
voltage-charge relationships $V = V(Q)$, the charge waveform $Q(x)$
giving this lower bound satisfies $Q_{\max} = 1$ and $Q_{\min} = -m$. The class
of voltage-charge relationships (15) falls within this class. Thus it would
seem feasible that a reasonable approximation to the maximum power
transfer will be obtained by taking the Fourier approximation, up to
the second harmonics, of the charge waveform giving the good lower
bound for the maximum obtainable fundamental power, and suitably
shifting and expanding (or contracting) the Fourier approximation so
that the resulting charge waveform $Q(x)$ satisfies $Q_{\max} = 1$ and
$Q_{\min} = -m$. This is the procedure adopted and, for the abrupt-junction
diode, operated in the region between forward conduction and reverse
breakdown, it actually yields the charge waveform that gives the maximum
power transfer.

The results of the numerical maximization process are tabulated in
Section 5.4. Tables II and III, for the cases $\nu = 2$ and $\nu = 3/2$ in (15), show
the values of $\max P_1$, the maximum power transfer, and the corresponding
values of $R_1$ and $R_2$, the reactive powers in the fundamental and
second harmonic, $I_{\max}$, the maximum normalized current magnitude,
and $(b^2 + c^2)$ and $(d^2 + e^2)$, the squares of the amplitudes of the first
and second harmonics in the charge waveform, for several values of $m$.
Tables IV and V show the values of $-s$ and $y$ which give $\max P_1$ and
also $y^0$ and $P_1^0$, the value of $P_1$ corresponding to the starting values
$y^0$ and $-s^0 = 1/\sqrt{3}$. It is interesting to observe how close $P_1^0$ is
to $\max P_1$, particularly for the smaller values of $m$. Table VI compares
the maximum power transfer with the maximum obtainable fundamental
power in the case $\nu = 2$, for several values of $m$. It is also worth noting
that in the case $\nu = 3/2$, $m = 0$ the maximum power transfer is 0.162,
whereas the maximum obtainable fundamental power is 0.408.


2.4 The Maximization of the Power Transfer in a Frequency Doubler, 
for the Current-Limited Diode 


We finally turn our attention to the current-limited diode in which 
(5), instead of (3), holds. Thus, from (5) to (7), 





LOG) ee, (20) 
(wQmax) w 
For the P*N abrupt-junction diode of germanium!”
Vmax & 1.03 X 10°(N)~ volts, 
(21) 


line Oe 16 X 10-°N amps/cm’, 


where $N$ is the donor concentration in cm$^{-3}$. But, from the
voltage-charge relationship,





Gas. = 2ee€ Nn ) (22) 
where e denotes electron charge. Hence, 
Qmax 2.16 X 10°°(N)"*” coulombs/em’, (23) 
and 
k= a ~ 0.74 X 10°(N)°S sec) (24) 


For N = 2 X 10"°, a reasonable value, x ~ 10" sec’, which is in the 
range of angular frequencies of interest. 


686 THE BELL SYSTEM TECHNICAL JOURNAL, MARCH. 1962 


We consider the problem of maximizing the power transfer from the
fundamental to the second harmonic, when the diode is being used as
a frequency doubler, and, as previously, the additional assumption is
made that the charge waveform $Q(x)$, and hence the current, contains
no higher than second harmonics. The first step is the construction of
the canonical representation of $Q(x)$ such that $Q_{\max} = 1$,
$Q'(\pi) = Q'_{\min} = -k$ and $Q'(2\tan^{-1} s) = Q'_{\max} \le k$. In addition to
the parameter $s$ there is the parameter $y$ which is subject to the
restriction $0 \le y \le \tfrac{1}{2}(1 - s^2)$, which of course also implies $s^2 \le 1$.
It is found that $P_1 = 0$ on $y = \tfrac{1}{2}(1 - s^2)$, independently of the
voltage-charge relationship. Since, if $\tilde{Q}(x) = Q(\pi - x)$, then
$\tilde{Q}_{\max} = 1$, $\tilde{Q}'_{\max} = k$ and $\tilde{Q}'_{\min} \ge -k$, it is sufficient
to consider the above canonical representation and to maximize $|P_1|$, in
order to maximize $P_1$ subject to $Q_{\max} = 1$, $|Q'|_{\max} = k$. We denote
this maximum by $\Pi(k)$. For the abrupt-junction diode operated in the
region between forward conduction and reverse breakdown, the
determination of $\Pi(k)$ is carried out analytically for $k$ sufficiently small
that $Q_{\min} \ge 0$. It is found that $\Pi(k) = 0.731k^3$, for $0 \le k \le 0.681$.
Combining this result with that obtained when the charge waveform is
subject just to bounded variation, $0 \le Q(x) \le 1$, it is shown that, from
the viewpoint of maximizing the actual fundamental real power $p_1$, the
optimum operating frequency lies in the range


1.299 < (dmx) < 1.468, (25) 


max 


and that 


1 x DACmax pi) < 2(4)" © 1 96. (26) 
ClincsO ane) 3 





For the abrupt-junction diode which is allowed to operate partly in the
forward conduction region, the maximization of the power transfer is
determined by numerical computation. For the values of $s$ and $y$ which give
$\max |P_1|$, i.e., $\Pi(k)$, the reactive powers $R_1$ and $R_2$, and $Q_{\min}$, i.e.,
$-M(k)$, were calculated, the results being given in Table VII (Section
6.4). It is shown that $\max P_1$ subject to $Q_{\max} \le 1$ and $|Q'|_{\max} \le k$ is
attained with $Q_{\max} = 1$ and $|Q'|_{\max} = k$. For $k \le 0.681$ it can also be
attained with $1.468k \le Q_{\max} \le 1$ and $|Q'|_{\max} = k$. Optimizing with respect
to the frequency it appears that $20(\max p_1) \approx i_{\max} v_{\max}$. Thus a
considerable improvement is obtained by permitting operation in the forward
conduction region. The optimum frequency in this case is roughly one-fifth
that in the case when operation is not allowed in the forward conduction
region, although close to $\max p_1$ may be obtained at one-third
the frequency.

In conclusion, we add that lack of space has necessitated the omission
of several aspects of this problem, and in particular of the determination
of the maximum obtainable fundamental power when the periodic
charge waveform is restricted only to have bounded slope.


III. THE CHARGE WAVEFORM WHICH, SUBJECT TO BOUNDED VARIATION,
MAXIMIZES THE POWER IN THE FUNDAMENTAL HARMONIC 


3.1 The Functional Form of the Charge Waveform 
From (1), (6), (7), (8) and (9),

$$P_n + iR_n = \left(\int_0^{2\pi} Q'(x)\,e^{-inx}\,dx\right)\left(\int_0^{2\pi} V[Q(x)]\,e^{inx}\,dx\right). \tag{27}$$


It is noted that $P_n$ is not affected by a time shift in the charge waveform
$Q(x)$, but it is reversed in sign by a time reversal of the waveform. On
the other hand, $R_n$ is not affected by either a time shift or a time
reversal in the charge waveform. Integrating by parts the first integral
in (27), and remembering that $Q(x)$ is periodic with period $2\pi$, and
then separating real and imaginary parts,

$$P_n = n(\alpha_n\delta_n - \beta_n\gamma_n); \qquad R_n = n(\alpha_n\gamma_n + \beta_n\delta_n), \tag{28}$$

where

$$\begin{aligned}
\alpha_n &= \int_0^{2\pi} Q(x)\sin nx\,dx; & \beta_n &= \int_0^{2\pi} Q(x)\cos nx\,dx;\\
\gamma_n &= \int_0^{2\pi} V[Q(x)]\sin nx\,dx; & \delta_n &= \int_0^{2\pi} V[Q(x)]\cos nx\,dx.
\end{aligned} \tag{29}$$

From (28) and (29) we may express $P_n$ as a double integral,

$$\frac{1}{n}P_n = \int_0^{2\pi}\!\!\int_0^{2\pi} Q(x)\,V[Q(y)]\sin n(x - y)\,dx\,dy. \tag{30}$$

To find the functional form of $Q(x)$ which, subject to the restriction

$$-m \le Q(x) \le 1, \tag{31}$$

maximizes $P_1$, we set

$$Q(x) = [(1 + m)\,\operatorname{sech} R(x) - m], \tag{32}$$

so that the inequalities in (31) are satisfied. A variational procedure
applied to (30) then shows that for stationary values of $P_1$ we have,
for each $x$,

$$Q(x) = -m, \quad\text{or}\quad Q(x) = 1, \quad\text{or}\quad V'[Q(x)] = \frac{(\gamma_1\cos x - \delta_1\sin x)}{(\alpha_1\cos x - \beta_1\sin x)}, \tag{33}$$

where $\alpha_1$, $\beta_1$, $\gamma_1$, and $\delta_1$ are as defined in (29). This, then, is the
functional form of $Q(x)$ which maximizes $P_1$. Evaluation of the integrals
in (29) will lead to four equations for the four unknowns $\alpha_1$, $\beta_1$, $\gamma_1$,
and $\delta_1$. Note, however, that

$$\frac{d}{dx}\left[\frac{\gamma_1\cos x - \delta_1\sin x}{\alpha_1\cos x - \beta_1\sin x}\right] = -\frac{(\alpha_1\delta_1 - \beta_1\gamma_1)}{(\alpha_1\cos x - \beta_1\sin x)^2} = -\frac{P_1}{(\alpha_1\cos x - \beta_1\sin x)^2}, \tag{34}$$

from (28), is of one sign. Since we are not interested in $P_1 = 0$, which
case arises in particular if $Q(x) = \text{const}$, it follows that allowance must
be made for discontinuities in $Q(x)$, since we require that $Q(x)$ be
periodic. Supposing that $Q(x)$ is discontinuous at $x = \varphi$, we obtain a
condition by integrating the equation

$$V'[Q(x)]\,dQ(x) = \frac{(\gamma_1\cos x - \delta_1\sin x)}{(\alpha_1\cos x - \beta_1\sin x)}\,dQ(x), \tag{35}$$

from $x = \varphi - 0$ to $x = \varphi + 0$. This gives

$$\big[V[Q(x)]\big]_{\varphi-0}^{\varphi+0} = \frac{(\gamma_1\cos\varphi - \delta_1\sin\varphi)}{(\alpha_1\cos\varphi - \beta_1\sin\varphi)}\,\big[Q(x)\big]_{\varphi-0}^{\varphi+0}. \tag{36}$$

3.2 The Charge Waveform for the Abrupt-Junction Diode 


In normalized form the voltage-charge relationship for the abrupt-junction
diode operated in the region between forward conduction and
reverse breakdown is

$$V(Q) = Q^2, \quad 0 \le Q(x) \le 1, \tag{37}$$

so that $m = 0$ in (31). We make use of the fact that $P_1$ is invariant
under the transformation $Q(x) \to Q(x - \theta)$, and choose $\theta$ so that
$\beta_1 = 0$, since this leads to a simplification of the analysis. Let us define
$a$ and $b$ by the equations

$$\gamma_1 = 2a\alpha_1; \quad \delta_1 = 2b\alpha_1; \quad \beta_1 = 0. \tag{38}$$

Then, from (28),

$$P_1 = 2b\alpha_1^2. \tag{39}$$

It is clear that $\max P_1 > 0$, and hence that $b > 0$. The functional form
of $Q(x)$ for $\max P_1$ is, from (33), (37) and (38),

$$Q(x) = 0, \quad\text{or}\quad Q(x) = 1, \quad\text{or}\quad Q(x) = (a - b\tan x). \tag{40}$$

Rejecting combinations which lead to $P_1 = 0$, we are led to the
conclusion that, within a cycle, $Q(x) = 1$ for an interval, it then follows
the curve $Q(x) = (a - b\tan x)$ and then $Q(x) = 0$ for an interval,
after which it jumps from 0 to 1 and the cycle is repeated.

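Let $\varphi$ be a value of $x$ at which a jump in $Q(x)$ from 0 to 1 occurs.
Then (36), (37), and (38) give

$$\tan\varphi = \frac{(2a - 1)}{2b}. \tag{41}$$

Thus we obtain $\max P_1$ by taking

$$Q(x) = \begin{cases}
1, & \text{for } \varphi < x \le \pi + \tan^{-1}[(a - 1)/b];\\
(a - b\tan x), & \text{for } \pi + \tan^{-1}[(a - 1)/b] \le x \le \pi + \tan^{-1}(a/b);\\
0, & \text{for } \pi + \tan^{-1}(a/b) \le x \le 2\pi + \varphi,
\end{cases} \tag{42}$$

where

$$-\frac{\pi}{2} < \tan^{-1}[(a - 1)/b] < \varphi < \tan^{-1}(a/b) < \frac{\pi}{2}, \tag{43}$$

and

$$Q(x + 2\pi) = Q(x), \quad \text{all } x. \tag{44}$$

Now $\alpha_1$, $\beta_1$, $\gamma_1$, and $\delta_1$ may be calculated from (29), (37), (42) and
(44). Substitution into (38) then leads to

$$\begin{aligned}
(2a - 1)\cos\varphi &= 2b\{[(a - 1)^2 + b^2]^{1/2} - (a^2 + b^2)^{1/2}\};\\
2b\cos\varphi + \sin\varphi + 3b\eta &= \{(a + 1)[(a - 1)^2 + b^2]^{1/2} - a(a^2 + b^2)^{1/2}\};\\
\sin\varphi &= \{[(a - 1)^2 + b^2]^{1/2} - (a^2 + b^2)^{1/2}\},
\end{aligned} \tag{45}$$

where

$$\eta = b\{\tanh^{-1}[a(a^2 + b^2)^{-1/2}] - \tanh^{-1}[(a - 1)[(a - 1)^2 + b^2]^{-1/2}]\}. \tag{46}$$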


It would appear that we now have one too many conditions on $a$, $b$ and
$\varphi$ because of the relationship in (41), which was obtained from the jump
condition at $x = \varphi$, but it is observed that the first and last equations
in (45) are consistent with (41). Since $|\varphi| < \pi/2$ and $b > 0$, (41) gives

$$\cos\varphi = 2b[(2a - 1)^2 + 4b^2]^{-1/2}; \qquad \sin\varphi = (2a - 1)[(2a - 1)^2 + 4b^2]^{-1/2}. \tag{47}$$

Substituting into the first equation in (45), we obtain

$$(2a - 1)[(2a - 1)^2 + 4b^2]^{-1/2} = \{[(a - 1)^2 + b^2]^{1/2} - (a^2 + b^2)^{1/2}\}. \tag{48}$$

A solution to (48) is $a = \tfrac{1}{2}$ and, moreover, this is the only solution
since if $a > \tfrac{1}{2}$ the L.H.S. $> 0$ and the R.H.S. $< 0$, and vice versa. Thus,

$$a = \tfrac{1}{2}, \qquad \varphi = 0. \tag{49}$$

The second equation in (45), using the definition of $\eta$ given in (46),
now leads to an equation for $b$, namely

$$3b^2\tanh^{-1}[(1 + 4b^2)^{-1/2}] = [\tfrac{1}{4}(1 + 4b^2)^{1/2} - b], \tag{50}$$

and (39) and the expression for $\alpha_1$ give

$$P_1 = 2b\{1 + 2b\tanh^{-1}[(1 + 4b^2)^{-1/2}]\}^2 = \frac{1}{18b}[2b + (1 + 4b^2)^{1/2}]^2, \tag{51}$$

using (50). Equation (50) was solved numerically and it was found that

$$b = 0.14136; \qquad \max P_1 = 0.6868. \tag{52}$$


The shape of $Q(x)$ which gives this maximum value of $P_1$ is shown in
Fig. 1. From (28) and (38) the corresponding reactive fundamental
power is given by

$$R_1 = 2a\alpha_1^2 = \frac{P_1}{2b} = 2.429. \tag{53}$$

Note that the reactive power is about three and a half times as large
as the real power.

Fig. 1 — Charge waveform for maximum obtainable fundamental power in
abrupt-junction diode operated in the region between forward conduction and
reverse breakdown.
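The numerical step behind (50) to (53) can be sketched as follows. This is a minimal illustration, not the computation actually used in the paper: it assumes (50) reads $3b^2\tanh^{-1}[(1+4b^2)^{-1/2}] = \tfrac{1}{4}(1+4b^2)^{1/2} - b$, that $\alpha_1 = 1 + 2b\tanh^{-1}[(1+4b^2)^{-1/2}]$ as in (51), and that $a = \tfrac{1}{2}$. The function names and the bisection bracket are hypothetical choices made for this sketch.

```python
import math


def residual(b):
    """Left side minus right side of Eq. (50), as read above."""
    root = math.sqrt(1.0 + 4.0 * b * b)
    return 3.0 * b * b * math.atanh(1.0 / root) - (0.25 * root - b)


def solve_b(lo=0.01, hi=0.5, iterations=80):
    """Plain bisection; [lo, hi] is an assumed bracket with a sign change."""
    f_lo = residual(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = residual(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


b = solve_b()
alpha1 = 1.0 + 2.0 * b * math.atanh(1.0 / math.sqrt(1.0 + 4.0 * b * b))
p1 = 2.0 * b * alpha1 ** 2          # first form of Eq. (51)
p1_alt = (2.0 * b + math.sqrt(1.0 + 4.0 * b * b)) ** 2 / (18.0 * b)  # second form
r1 = p1 / (2.0 * b)                 # Eq. (53) with a = 1/2
print(f"b = {b:.5f}, max P1 = {p1:.4f}, R1 = {r1:.3f}")
```

The two forms of (51) agree only at the root of (50), so comparing `p1` with `p1_alt` gives a quick consistency check on the solution.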


3.3 The Charge Waveform for the Graded-Junction Diode 


We now turn our attention to the second diode of interest, namely
the graded-junction diode, and suppose that it is operated in the region
between forward conduction and reverse breakdown. In normalized
form the voltage-charge relationship is

$$V(Q) = Q^{3/2}, \quad 0 \le Q(x) \le 1. \tag{54}$$

The determination of the maximum obtainable fundamental power,
$\max P_1$, is carried out along the same lines as for the abrupt-junction
diode, although the details are more involved. The analytical form of
the charge waveform $Q(x)$ which gives $\max P_1$ is

$$Q(x) = \begin{cases}
1, & \text{for } \psi < x \le \pi + \tan^{-1}[(a - 1)/b];\\
(a - b\tan x)^2, & \text{for } \pi + \tan^{-1}[(a - 1)/b] \le x \le \pi + \tan^{-1}(a/b);\\
0, & \text{for } \pi + \tan^{-1}(a/b) \le x \le 2\pi + \psi,
\end{cases} \tag{55}$$

where

$$-\frac{\pi}{2} < \tan^{-1}[(a - 1)/b] < \psi < \tan^{-1}(a/b) < \frac{\pi}{2}, \tag{56}$$

and (44) holds. Here

$$\gamma_1 = \tfrac{3}{2}a\alpha_1; \quad \delta_1 = \tfrac{3}{2}b\alpha_1; \quad \beta_1 = 0, \tag{57}$$

which leads to three equations for $a$, $b$ and $\psi$. These equations are
consistent with the jump condition (36) which gives

$$\tan\psi = (3a - 2)/(3b). \tag{58}$$

Elimination of $\psi$ leads to two equations for $a$ and $b$ which were solved
numerically, giving

$$b = 0.11098; \qquad a = 0.67375. \tag{59}$$

These values lead to

$$\max P_1 = 0.4084; \qquad R_1 = 2.479. \tag{60}$$

The corresponding charge waveform $Q(x)$ is depicted in Fig. 2.
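As a rough cross-check of (59) and (60), the sketch below evaluates the integrals in (29) by the midpoint rule for the waveform (55) and forms $P_1$ and $R_1$ from (28). It assumes the graded-junction law $V(Q) = Q^{3/2}$, the jump location from $\tan\psi = (3a - 2)/(3b)$, and segment end points at $\pi + \tan^{-1}[(a-1)/b]$ and $\pi + \tan^{-1}(a/b)$. The helper names and the number of quadrature points are illustrative assumptions, not part of the paper.

```python
import math

A, B = 0.67375, 0.11098                      # values quoted in (59)
PSI = math.atan((3 * A - 2) / (3 * B))       # jump location, from (58)
X1 = math.pi + math.atan((A - 1) / B)        # end of the Q = 1 interval
X2 = math.pi + math.atan(A / B)              # end of the curved segment


def charge(x):
    """Waveform (55), evaluated on the period that starts at PSI."""
    x = PSI + (x - PSI) % (2 * math.pi)
    if x <= X1:
        return 1.0
    if x <= X2:
        return (A - B * math.tan(x)) ** 2
    return 0.0


def voltage(q):
    """Graded-junction voltage-charge law (54), assumed to be Q**1.5."""
    return q ** 1.5


def fundamental_powers(n_points=100_000):
    """Midpoint-rule estimate of (P1, R1) from (28) and (29)."""
    h = 2 * math.pi / n_points
    alpha = beta = gamma = delta = 0.0
    for i in range(n_points):
        x = (i + 0.5) * h
        q = charge(x)
        v = voltage(q)
        s, c = math.sin(x), math.cos(x)
        alpha += q * s
        beta += q * c
        gamma += v * s
        delta += v * c
    alpha, beta, gamma, delta = (t * h for t in (alpha, beta, gamma, delta))
    return alpha * delta - beta * gamma, alpha * gamma + beta * delta


p1, r1 = fundamental_powers()
print(f"P1 = {p1:.4f}, R1 = {r1:.3f}")
```

Because the waveform has a jump, the midpoint rule converges only linearly here; the point count is chosen so that the error stays well below the four figures quoted in (60).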
3.4 The Abrupt-Junction Diode When the Region of Operation Includes 
Forward Conduction 


In this case the normalized voltage-charge relationship is, from (2)
and (4),

$$V(Q) = [\max(0,Q)]^2, \quad -m \le Q(x) \le 1, \quad m > 0. \tag{61}$$

As previously, we translate $Q(x)$ so that $\beta_1 = 0$ and again define $a$ and
$b$ by (38), so that (39) for $P_1$ also holds. From (33), (38), and (61),
the functional form of $Q(x)$ for $\max P_1$ is

$$Q(x) = -m, \quad\text{or}\quad Q(x) = 1, \quad\text{or}\quad \max[0,Q(x)] = (a - b\tan x). \tag{62}$$


Fig. 2 — Charge waveform for maximum obtainable fundamental power in
graded-junction diode operated in the region between forward conduction and
reverse breakdown.

Thus we are led to the conclusion that within a cycle $Q(x) = 1$ for an
interval, it then follows the curve $Q(x) = (a - b\tan x)$ until the point
at which $Q(x) = 0$ where it jumps to the value $-m$, and after $Q(x) = -m$
for an interval it jumps to the value 1 and the cycle is repeated.
Thus in this idealized case there are two discontinuities in $Q(x)$ in one
cycle. Note that according to (36), together with (38), the jump of
$Q(x)$ from 0 to $-m$ occurs at $x = \psi$ where $(a - b\tan\psi) = 0$, since
$V(0) = 0$ and $V(-m) = 0$. Hence we obtain $\max P_1$ by taking

$$Q(x) = \begin{cases}
1, & \text{for } \varphi < x \le \pi + \tan^{-1}[(a - 1)/b];\\
(a - b\tan x), & \text{for } \pi + \tan^{-1}[(a - 1)/b] \le x \le \pi + \tan^{-1}(a/b);\\
-m, & \text{for } \pi + \tan^{-1}(a/b) < x \le 2\pi + \varphi,
\end{cases} \tag{63}$$

where (43) and (44) hold. In this case the jump condition at $x = \varphi$
gives

$$\tan\varphi = \frac{[2a(1 + m) - 1]}{2b(1 + m)}. \tag{64}$$

The calculation of $\alpha_1$, $\beta_1$, $\gamma_1$, and $\delta_1$, and substitution into (38),
leads to three equations for $a$, $b$, and $\varphi$, which are consistent with (64).
The elimination of $\varphi$ leads to two equations for $a$ and $b$, which
quantities of course are functions of $m$. It was found to be possible to
eliminate $m$ analytically from these two equations, so that instead of
solving the two simultaneous equations for $a$ and $b$ for given values of $m$,
the single relation between $a$ and $b$ which did not involve $m$ was solved
for $b$ for given values of $a$. Thus a parametric solution was obtained in
the form $b = b(a)$, $m = m(a)$. From this $a$ and $b$ were plotted
graphically against $m$ and the results are shown in Fig. 3. It was shown
analytically that

$$1 < 4a(1 + m) \le 2, \tag{65}$$

the upper bound being attained for $m = 0$ and the lower bound being
approached for $m \to \infty$. Also, as $m \to \infty$ it is found that

$$b \sim \sqrt{3}\,a; \qquad \max P_1 \sim \frac{3\sqrt{3}}{2}\,m; \qquad R_1 \sim \frac{3}{2}\,m, \tag{66}$$

where $R_1$ is the reactive power in the fundamental. Fig. 4 shows
$\max P_1$ and the corresponding $R_1$ as functions of $m$. It is interesting to
note that the ratio $(\max P_1)/R_1$ increases with increasing $m$ from its
initial value of 0.28, its asymptotic value being $\sqrt{3}$, from (66). The
charge waveform $Q(x)$ giving rise to $\max P_1$ is shown, for $m = 1$, in
Fig. 5.


Fig. 3 — Parameters in charge waveform for maximum obtainable fundamental
power in abrupt-junction diode operated partly in forward conduction region, vs.
minimum charge.

Fig. 4 — Maximum obtainable fundamental power, and corresponding
fundamental reactive power, in abrupt-junction diode operated partly in forward
conduction region, vs. minimum charge.

Fig. 5 — Charge waveform for maximum obtainable fundamental power in
abrupt-junction diode operated partly in forward conduction region.


3.5 The Charge Waveform for an Idealized Voltage-Charge Relationship


We now consider a special voltage-charge relationship which may be
handled analytically. Thus we suppose that the capacitance has a finite
constant value for reverse bias and is infinite for forward bias, and
hence in normalized form


$$V(Q) = \max(0,Q), \quad -m \le Q(x) \le 1; \quad m > 0. \tag{67}$$

Since $V'(Q)$ is constant except possibly at $Q = 0$, where it is
indeterminate, we deduce from (33) that $Q(x)$ has one of the values 1, 0, and
$-m$ at each point. Omitting further details, it is found that $\max P_1$ is
given by

$$Q(x) = \begin{cases}
1, & 0 < x \le 2\pi/3;\\
0, & 2\pi/3 < x \le 4\pi/3;\\
-m, & 4\pi/3 < x \le 2\pi.
\end{cases} \tag{68}$$

Also,

$$\max P_1 = \frac{3\sqrt{3}}{2}\,m; \qquad R_1 = \frac{3}{2}(m + 2). \tag{69}$$

Note that, as might be expected, these values are asymptotically, as
$m \to \infty$, the same as for the voltage-charge relationship in (61), as is
seen from (66).
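Because the waveform (68) is piecewise constant, the integrals in (29) reduce to differences of sines and cosines, and (69) can be checked directly. The sketch below does this under the assumption that $P_1 = \alpha_1\delta_1 - \beta_1\gamma_1$ and $R_1 = \alpha_1\gamma_1 + \beta_1\delta_1$ as in (28); the function name is a hypothetical choice for this illustration.

```python
import math


def powers_three_level(m):
    """P1 and R1 for Q = 1, 0, -m on consecutive thirds of the period,
    with the idealized law V(Q) = max(0, Q) of (67)."""
    edges = [0.0, 2 * math.pi / 3, 4 * math.pi / 3, 2 * math.pi]
    q_levels = [1.0, 0.0, -m]
    v_levels = [max(0.0, q) for q in q_levels]
    alpha = beta = gamma = delta = 0.0
    for lo, hi, q, v in zip(edges, edges[1:], q_levels, v_levels):
        sin_int = math.cos(lo) - math.cos(hi)   # integral of sin x over (lo, hi)
        cos_int = math.sin(hi) - math.sin(lo)   # integral of cos x over (lo, hi)
        alpha += q * sin_int
        beta += q * cos_int
        gamma += v * sin_int
        delta += v * cos_int
    return alpha * delta - beta * gamma, alpha * gamma + beta * delta


for m in (0.5, 1.0, 3.0):
    p1, r1 = powers_three_level(m)
    print(m, round(p1, 6), round(r1, 6))
```

For each $m$ the two numbers can be compared with $\tfrac{3\sqrt{3}}{2}m$ and $\tfrac{3}{2}(m + 2)$.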
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IV. BOUNDS ON THE MAXIMUM OBTAINABLE FUNDAMENTAL POWER ~ 


4.1 Lower Bounds 


We now derive some lower bounds for the maximum obtainable power
in the fundamental, for a general voltage-charge relationship, by the
simple expedient of choosing specific charge waveforms. Any $P_1$ which
we obtain is, of course, a lower bound for $\max P_1$. Thus we consider
the charge waveforms

$$Q(x) = \begin{cases}
\sigma, & \text{on } \Gamma_1;\\
\rho, & \text{on } \Gamma_2;\\
\tau, & \text{on } \Gamma_3,
\end{cases} \tag{70}$$

where each $\Gamma_j$ ($j = 1,2,3$) is a finite collection of nonintersecting
intervals, open at the left and closed at the right, and furthermore

$$\Gamma_i \cap \Gamma_j = \emptyset, \ i \ne j; \qquad \bigcup_{j=1}^{3} \Gamma_j = (0, 2\pi]. \tag{71}$$

From (28), (29), (70) and (71),

$$P_n = nL(\sigma,\rho,\tau)\left[\left(\int_{\Gamma_1}\cos nx\,dx\right)\left(\int_{\Gamma_2}\sin nx\,dx\right) - \left(\int_{\Gamma_1}\sin nx\,dx\right)\left(\int_{\Gamma_2}\cos nx\,dx\right)\right], \tag{72}$$

where

$$L(\sigma,\rho,\tau) = [(\rho - \tau)V(\sigma) + (\tau - \sigma)V(\rho) + (\sigma - \rho)V(\tau)]. \tag{73}$$

The significant point here is that we can choose the intervals $\Gamma_1$ and
$\Gamma_2$ to make $P_n$ as large as possible, for the waveform class of (70),
independently of the functional form of the voltage-charge relationship,
$V = V(Q)$. This is still true if we wish to make $P_1$ as large as possible
subject to the condition $P_1 + P_2 = 0$, say, since the factor containing
$V$, namely $L(\sigma,\rho,\tau)$, occurs in each $P_n$. Note, from (73), that $L(\sigma,\rho,\tau)$
vanishes unless $\sigma$, $\rho$, and $\tau$ are unequal. Also, if $(\sigma,\rho,\tau)$ undergo a cyclic
permutation then $L(\sigma,\rho,\tau)$ is unaltered, but if $(\sigma,\rho,\tau)$ undergo an
anticyclic permutation then $L(\sigma,\rho,\tau)$ is reversed in sign. We suppose that
the charge waveform has bounded variation as in (31) and define

$$L = \max_{-m \le (\sigma,\rho,\tau) \le 1} [L(\sigma,\rho,\tau)] \ge 0. \tag{74}$$

Also from (72) it is seen that $P_n$ changes sign if $\Gamma_1$ and $\Gamma_2$ are
interchanged, which is equivalent to an anticyclic permutation of $(\sigma,\rho,\tau)$.

Thus we are interested in making the modulus (or magnitude) of the
bracketed expression following the factor $L(\sigma,\rho,\tau)$ in (72) as large as
possible, in order to obtain as large as possible a lower bound for
$\max P_1$. We will restrict ourselves to special $\Gamma_j$ and find the maximum
modulus of the bracketed expression in (72) for these subclasses. In
particular, we consider

$$\Gamma_1 = (0,\lambda]; \qquad \Gamma_2 = (\mu,\nu], \qquad 0 < \lambda \le \mu < \nu \le 2\pi. \tag{75}$$

Then, from (72),

$$P_n = \frac{1}{n}L(\sigma,\rho,\tau)\,F(n\lambda,n\mu,n\nu), \tag{76}$$

where

$$F(\lambda,\mu,\nu) = [\sin(\nu - \lambda) - \sin(\mu - \lambda) + \sin\mu - \sin\nu] = 4\sin[(\nu - \mu)/2]\sin(\lambda/2)\sin[(\nu + \mu - \lambda)/2]. \tag{77}$$

We first set $\lambda = \mu$ and determine $\mu$ and $\nu$ to maximize $F(\mu,\mu,\nu)$ which
from (75) and (77) is seen to be positive. The stationary values of
$[\sin(\nu - \mu) + \sin\mu - \sin\nu]$ are given by

$$\cos\mu = \cos(\mu - \nu) = \cos\nu. \tag{78}$$

Hence $F(\mu,\mu,\nu)$ is a maximum for $\mu = 2\pi/3$, $\nu = 4\pi/3$ and from (74),
(76), and (77) the corresponding maximum of $P_1$ is

$$P_1 = \frac{3\sqrt{3}}{2}\,L. \tag{79}$$

Now for the voltage-charge relationship (37) it is readily verified that
$L(\sigma,\rho,\tau)$, as defined in (73), has a maximum value of $\tfrac{1}{4}$ which is attained
for $\sigma = 1$, $\rho = \tfrac{1}{2}$, $\tau = 0$, and hence in this case we obtain the value
$P_1 = 0.650$, which is quite close to the value of $\max P_1$ given in (52).
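The maximization in (74) is over only three bounded variables, so for a given voltage-charge law it can be approximated by a coarse grid search. The following is a minimal sketch of that idea, assuming $L(\sigma,\rho,\tau) = (\rho-\tau)V(\sigma) + (\tau-\sigma)V(\rho) + (\sigma-\rho)V(\tau)$ as in (73); the grid resolution and function names are illustrative assumptions, and a grid search only approximates the maximum when the optimum does not fall on a grid point.

```python
import math


def L_value(sigma, rho, tau, V):
    """The combination L(sigma, rho, tau) of Eq. (73)."""
    return ((rho - tau) * V(sigma)
            + (tau - sigma) * V(rho)
            + (sigma - rho) * V(tau))


def max_L(V, m=0.0, steps=40):
    """Grid approximation to Eq. (74) on -m <= sigma, rho, tau <= 1."""
    grid = [-m + (1.0 + m) * i / steps for i in range(steps + 1)]
    return max(L_value(s, r, t, V) for s in grid for r in grid for t in grid)


def abrupt(q):
    """Abrupt-junction law (37)."""
    return q * q


L = max_L(abrupt)
lower_bound = 1.5 * math.sqrt(3.0) * L   # Eq. (79)
print(L, lower_bound)
```

For the abrupt-junction law with $m = 0$ the grid contains the point $(1, \tfrac{1}{2}, 0)$, so the search returns the maximum quoted in the text.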
We now consider the maximization of $F(\lambda,\mu,\nu)$ subject to the
condition

$$F(\lambda,\mu,\nu) + \tfrac{1}{2}F(2\lambda,2\mu,2\nu) = 0, \tag{80}$$

corresponding to $P_1 + P_2 = 0$. Using the second part of (77), (80)
becomes

$$1 + 4\cos[(\nu - \mu)/2]\cos(\lambda/2)\cos[(\nu + \mu - \lambda)/2] = 0, \tag{81}$$

supposing that $F(\lambda,\mu,\nu) \ne 0$. It is interesting to note that (81) cannot
be satisfied with $\lambda = \mu$. It is found that $F(\lambda,\mu,\nu)$ is maximized, subject
to (75) and (81), by

$$\lambda = 2(\pi - \theta); \quad \mu = \theta; \quad \nu = (2\pi - \theta), \tag{82}$$

where

$$\cos\theta = -(\tfrac{1}{4})^{1/3}, \quad (2\pi/3 < \theta < \pi), \tag{83}$$

and the corresponding value of $P_1$, with $P_1 + P_2 = 0$, is

$$P_1/(4L) = [1 - (\tfrac{1}{4})^{2/3}]^{3/2} = 0.468. \tag{84}$$
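The constrained result can be checked numerically. In the sketch below the assignment $\lambda = 2(\pi - \theta)$, $\mu = \theta$, $\nu = 2\pi - \theta$ with $\cos\theta = -(1/4)^{1/3}$ is an assumption about how the garbled display (82) reads; it is used here because it satisfies the ordering in (75), makes the $P_1 + P_2 = 0$ condition hold, and reproduces the value 0.468. $F$ is taken as $\sin(\nu-\lambda) - \sin(\mu-\lambda) + \sin\mu - \sin\nu$.

```python
import math


def F(lam, mu, nu):
    """First form of Eq. (77)."""
    return (math.sin(nu - lam) - math.sin(mu - lam)
            + math.sin(mu) - math.sin(nu))


theta = math.acos(-(0.25 ** (1.0 / 3.0)))
lam, mu, nu = 2.0 * (math.pi - theta), theta, 2.0 * math.pi - theta

# Condition (80): F(lam, mu, nu) + F(2 lam, 2 mu, 2 nu) / 2 = 0.
constraint = F(lam, mu, nu) + 0.5 * F(2 * lam, 2 * mu, 2 * nu)

# P1 = L * F for n = 1, so P1 / (4 L) = F / 4.
ratio = F(lam, mu, nu) / 4.0
print(constraint, ratio)
```

Multiplying `ratio` by 4 gives the coefficient of $L$ in the lower bound for $\max[P_1 \mid P_1 + P_2 = 0]$.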


4.2 An Upper Bound, and its Relationship to a Lower Bound 


In Appendix A we give the derivation of an upper bound, for a general
voltage-charge relationship, on the maximum obtainable fundamental
power, using the fact that the charge waveform is of bounded
variation, (31). It is shown that

$$\max P_1 \le 4(1 + m)U, \tag{85}$$

where

$$U = \min_{\lambda} \Big\{ \max_{-m \le \sigma \le 1} [\lambda\sigma - V(\sigma)] - \min_{-m \le \sigma \le 1} [\lambda\sigma - V(\sigma)] \Big\}. \tag{86}$$

In the previous section we showed, by example, that

$$\max P_1 \ge \frac{3\sqrt{3}}{2}\,L, \tag{87}$$

where $L$ is defined by (73) and (74). From Appendix A, we have

$$1 \le (1 + m)U/L \le 2, \tag{88}$$

and these bounds cannot be improved without restriction on the
voltage-charge relationship. However, there is a large class of voltage-charge
relationships for which the lower bound is attained, namely those for
which $[(\rho + m)V(1) - (m + 1)V(\rho) + (1 - \rho)V(-m)]$ does not
change sign in $-m \le \rho \le 1$. From (85) and (87) it follows that

$$\frac{3\sqrt{3}}{2} \le \frac{\max P_1}{L} \le 4, \quad \text{if } L = (1 + m)U. \tag{89}$$

Also, for the above class, $L$ in (74) is given with $\sigma = 1$, $\tau = -m$, or
vice versa, and $U$ in (86) is given with $\lambda = [V(1) - V(-m)]/(1 + m)$.
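A class of voltage-charge relationships of interest is

$$V(Q) = [\max(0,Q)]^{\nu}, \quad -m \le Q \le 1; \quad m \ge 0, \quad \nu \ge 1, \tag{90}$$

of which we have already considered the cases $\nu = 2$, $\nu = 1$, and
$\nu = 3/2$ (with $m = 0$). It is readily seen that this class satisfies the above
condition, and hence

$$\begin{aligned}
L &= \max_{-m \le \rho \le 1} \{(\rho + m) - (1 + m)[\max(0,\rho)]^{\nu}\};\\
U &= \min_{\lambda} \Big\{ \max_{-m \le \sigma \le 1} (\lambda\sigma - [\max(0,\sigma)]^{\nu}) - \min_{-m \le \sigma \le 1} (\lambda\sigma - [\max(0,\sigma)]^{\nu}) \Big\}.
\end{aligned} \tag{91}$$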


Thus,

$$L = m + \left(1 - \frac{1}{\nu}\right)[\nu(1 + m)]^{-1/(\nu - 1)} = (1 + m)U, \tag{92}$$

and the bounds on $(\max P_1)/L$ in (89) hold. From (69) and (92) it is
seen that the lower bound is exact for the case $\nu = 1$, $m > 0$. For
$\nu = 3/2$ and $m = 0$, $3\sqrt{3}\,L/2 = 0.385$ as compared with $\max P_1 = 0.408$.
For $\nu = 2$, $L = (2m + 1)^2/[4(1 + m)]$, and Table I shows the ratio
$2(\max P_1)/(3\sqrt{3}\,L)$ for several values of $m$, and it is noted that the
lower bound improves with increasing $m$.


TABLE I—(ν = 2)

  m                       0       0.589   1.20    1.89    3.07    5.50
  2(max P_1)/(3√3 L)      1.058   1.042   1.027   1.018   1.012   1.006







V. THE MAXIMIZATION OF THE POWER TRANSFER FROM THE FUNDA- 
MENTAL TO SECOND HARMONIC, WITH BOUNDED CHARGE WAVEFORM 


5.1 The Canonical Representation of the Charge Waveform 


We wish to consider the problem of maximizing the power transfer
from the fundamental to the second harmonic, when the charge waveform
contains no higher than second harmonics, so that

$$Q(x) = a + b\sin x + c\cos x + d\sin 2x + e\cos 2x. \tag{93}$$

We also impose the conditions

$$Q_{\max} = 1; \qquad Q_{\min} = -m. \tag{94}$$


Note that it does not follow a priori that the maximum power transfer
subject to (94) is equal to the maximum subject to $Q_{\max} \le 1$,
$Q_{\min} \ge -m$. We observe, however, that for the voltage-charge relationship
(90),

$$\max [P_1 \mid Q_{\max} = r,\ Q_{\min} = -q] = r^{\nu + 1}\max [P_1 \mid Q_{\max} = 1,\ Q_{\min} = -q/r], \quad r > 0, \tag{95}$$

as may be deduced from (28) and (29). Thus it is sufficient to determine
$\max [P_1 \mid Q_{\max} = 1,\ Q_{\min} = -m]$, that is, $\max P_1$ subject to the
conditions of (94), for a range of values of $m$.

A canonical representation of $Q(x)$ is found in Appendix B. In addition
to the two conditions in (94) it is supposed, by a suitable choice
of time origin, that

$$Q(\pi) = Q_{\min} = -m. \tag{96}$$
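Thus the five coefficients in (93) are given in terms of two parameters
and it is found that

$$\begin{aligned}
a &= [(c - e) - m]; \qquad b = 2d = (1 + m)sy;\\
c &= (1 + m)[\tfrac{1}{2}(1 - s^4) - s^2 y];\\
e &= \tfrac{1}{4}(1 + m)[y(1 - s^2) - \tfrac{1}{2}(1 + s^2)^2].
\end{aligned} \tag{97}$$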


The parameter $s$ arises from the equation

$$Q(2\tan^{-1} s) = Q_{\max} = 1. \tag{98}$$

The parameter $y$ is subject to the condition

$$0 \le y \le (1 - s^2), \tag{99}$$

which of course also implies that $s^2 \le 1$. Thus we have a two-parameter
canonical representation of $Q(x)$, and these two parameters lie in a
bounded region. Moreover, it is shown in Appendix B that, independently
of the voltage-charge relationship $V = V(Q)$,

$$P_1 \big|_{y=0} = 0; \qquad P_1 \big|_{y=(1-s^2)} = 0, \tag{100}$$

so that $P_1$ vanishes on the boundary of this region. Also it is seen, from
(93) and (97), that changing the sign of $s$ is equivalent to the
transformation $Q(x) \to Q(2\pi - x)$, and hence

$$P_1(-s,y) = -P_1(s,y); \qquad P_1 \big|_{s=0} = 0. \tag{101}$$
5.2 The Abrupt-Junction Diode 


We now consider the abrupt-junction diode operated in the region
between forward conduction and reverse breakdown. From (28), (29),
(37), (93) and (97), with $m = 0$, it follows that

$$P_1 = -\frac{\pi^2}{4}\,s(1 + s^2)^2\,y[(1 - s^2) - y]. \tag{102}$$

The maximum of (102) subject to (99) is

$$\max P_1 = \frac{4\pi^2}{81\sqrt{3}} = 0.2814, \tag{103}$$

being given by $s = -(1/\sqrt{3})$, $y = \tfrac{1}{3}$. Thus a charge waveform giving
$\max P_1$ is

$$Q(x) = \frac{1}{2} + \frac{1}{3\sqrt{3}}\left[2\sin\left(x + \frac{2\pi}{3}\right) + \sin 2\left(x + \frac{2\pi}{3}\right)\right], \tag{104}$$

and the corresponding fundamental reactive power is found to be

$$R_1 = \frac{4\pi^2}{27} = 1.462. \tag{105}$$

$\tilde{Q}(x) = Q(x - (2\pi/3))$ is depicted in Fig. 6. It is interesting to
compare (52) and (103), and Figs. 1 and 6. We comment that the above
results may be obtained quite elegantly, without using the canonical
representation of the charge waveform.
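The canonical representation lends itself to a direct numerical check. The sketch below builds the coefficients of (93) from the parameters $s$ and $y$, assuming (97) reads $b = 2d = (1+m)sy$, $c = (1+m)[\tfrac{1}{2}(1-s^4) - s^2y]$, $e = \tfrac{1}{4}(1+m)[y(1-s^2) - \tfrac{1}{2}(1+s^2)^2]$ and $a = (c - e) - m$, then evaluates $P_1$ and $R_1$ for $V = Q^2$ from (28) and (29). Since the integrands are trigonometric polynomials, an equally spaced rectangle rule is exact up to rounding. All function names are hypothetical.

```python
import math


def canonical_coefficients(s, y, m=0.0):
    """Coefficients (a, b, c, d, e) of Eq. (93), from Eq. (97) as read above."""
    b = (1.0 + m) * s * y
    d = 0.5 * b
    c = (1.0 + m) * (0.5 * (1.0 - s ** 4) - s * s * y)
    e = 0.25 * (1.0 + m) * (y * (1.0 - s * s) - 0.5 * (1.0 + s * s) ** 2)
    a = (c - e) - m
    return a, b, c, d, e


def charge(x, coeffs):
    a, b, c, d, e = coeffs
    return (a + b * math.sin(x) + c * math.cos(x)
            + d * math.sin(2 * x) + e * math.cos(2 * x))


def powers_abrupt(s, y, n_points=64):
    """(P1, R1) for V = Q^2 with m = 0, by the rectangle rule."""
    coeffs = canonical_coefficients(s, y)
    h = 2.0 * math.pi / n_points
    alpha = beta = gamma = delta = 0.0
    for i in range(n_points):
        x = i * h
        q = charge(x, coeffs)
        v = q * q
        alpha += q * math.sin(x) * h
        beta += q * math.cos(x) * h
        gamma += v * math.sin(x) * h
        delta += v * math.cos(x) * h
    return alpha * delta - beta * gamma, alpha * gamma + beta * delta


def p1_closed_form(s, y):
    """Eq. (102)."""
    return -0.25 * math.pi ** 2 * s * (1 + s * s) ** 2 * y * ((1 - s * s) - y)


s_opt, y_opt = -1.0 / math.sqrt(3.0), 1.0 / 3.0
p1, r1 = powers_abrupt(s_opt, y_opt)
print(f"P1 = {p1:.4f}, R1 = {r1:.3f}")
```

Evaluating `powers_abrupt` at other points of the region $0 \le y \le 1 - s^2$ and comparing with `p1_closed_form` is a convenient way to test any transcription of (97).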


5.3 A Charge Waveform Which Provides an Approximation to the
Maximum Power Transfer


Fig. 6 — Charge waveform for maximum power transfer from fundamental to
second harmonic in abrupt-junction diode operated in the region between forward
conduction and reverse breakdown.

In Section IV we obtained a lower bound to the maximum obtainable
fundamental power, (87), and it was seen to be a close bound in the
particular cases considered. The charge waveform giving this lower
bound is one which has values $\sigma$, $\rho$, and $\tau$ on consecutive intervals of
$x$ of length $2\pi/3$. Here $\sigma$, $\rho$, and $\tau$ are those values which, subject to
$-m \le (\sigma,\rho,\tau) \le 1$, maximize $L(\sigma,\rho,\tau)$, as defined in (73). It was also
pointed out that if $[(\rho + m)V(1) - (1 + m)V(\rho) + (1 - \rho)V(-m)]$,
i.e., $L(1,\rho,-m)$, does not change sign in $-m \le \rho \le 1$, then $L(\sigma,\rho,\tau)$
is maximized with $\sigma = 1$, $\tau = -m$ (or vice versa) and a suitable value
of $\rho$. The class of voltage-charge relationships given in (90) satisfies
this condition and then

$$\rho = [(1 + m)\nu]^{-1/(\nu - 1)}. \tag{106}$$
Now the Fourier coefficients, up to the second harmonic as in (93), of
the charge waveform giving the close lower bound to the maximum
obtainable fundamental power, are

$$a = \tfrac{1}{3}(\sigma + \rho + \tau); \qquad b = 2d = \frac{3}{2\pi}(\sigma - \tau); \qquad c = -2e = \frac{\sqrt{3}}{2\pi}(\sigma + \tau - 2\rho). \tag{107}$$


We will restrict ourselves to that class of voltage-charge relationships,
$V = V(Q)$, for which $L(\sigma,\rho,\tau)$ in (73) attains its maximum, subject
to $-m \le (\sigma,\rho,\tau) \le 1$, when

$$\sigma = 1, \quad \tau = -m, \quad -m < \rho < 1. \tag{108}$$

It would seem feasible that we might obtain a reasonable approximation
to the maximum power transfer from the fundamental to the second
harmonic, by suitably shifting and expanding (or contracting) the
above Fourier approximation, so that (94) is satisfied. Setting $\sigma = 1$,
$\tau = -m$ in (107) and carrying out this procedure, we obtain the
approximating charge waveform

$$\tilde{Q}(x) = \tfrac{1}{3}(1 + \rho - m) + \frac{(1 + m)}{3\sqrt{3}}(2\sin x + \sin 2x) + \tfrac{1}{9}(1 - m - 2\rho)(2\cos x - \cos 2x). \tag{109}$$

If we define $Q(x) = \tilde{Q}[x + (2\pi/3)]$, then (96) is satisfied and in the
canonical representation of $Q(x)$, (97), we have

$$s = -\frac{1}{\sqrt{3}}; \qquad y = \frac{2(1 - \rho)}{3(1 + m)}. \tag{110}$$

For the abrupt-junction diode operated in the region between forward
conduction and reverse breakdown, $\rho = \tfrac{1}{2}$, setting $m = 0$, $\nu = 2$ in
(106). Hence, from (110), $s = -(1/\sqrt{3})$ and $y = 1/3$, so that, from
the previous section, the approximating charge waveform is actually
the one which gives the maximum power transfer.
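The starting point for the numerical maximization can be written as a small helper. This sketch assumes (106) reads $\rho = [(1+m)\nu]^{-1/(\nu-1)}$ and that (110) reads $s = -1/\sqrt{3}$, $y = 2(1-\rho)/[3(1+m)]$; the second expression is a reconstruction of a garbled display, chosen because it reproduces $y = 1/3$ for the abrupt-junction case quoted in the text. The function name is hypothetical.

```python
import math


def starting_values(m, nu):
    """Return (rho, s0, y0) for the law V(Q) = max(0, Q)**nu with nu > 1."""
    rho = (nu * (1.0 + m)) ** (-1.0 / (nu - 1.0))   # Eq. (106)
    s0 = -1.0 / math.sqrt(3.0)                      # Eq. (110), first part
    y0 = 2.0 * (1.0 - rho) / (3.0 * (1.0 + m))      # Eq. (110), assumed form
    return rho, s0, y0


rho, s0, y0 = starting_values(0.0, 2.0)   # abrupt junction, m = 0
print(rho, s0, y0)

# The starting point should lie inside the region 0 <= y <= 1 - s^2 of (99).
for m in (0.0, 0.5, 1.0, 2.0, 5.0):
    _, s_m, y_m = starting_values(m, 2.0)
    print(m, 0.0 <= y_m <= 1.0 - s_m * s_m)
```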


5.4 The Numerical Computation of the Maximum Power Transfer, for 
Particular Diodes 


We have already obtained a two-parameter canonical representation
of the charge waveform containing no higher than second harmonics
and satisfying (94). The two parameters $s$ and $y$ lie in the bounded
region given by (99), and $P_1$ vanishes, independently of the voltage-charge
relationship, on the boundary of this region. Also, since $P_1$ is
antisymmetric in $s$, it is sufficient to consider only half the region and
to maximize $|P_1|$. The maximization was carried out numerically for
particular diodes, by means of the iterative process of fitting a quadric
surface. As a starting point $(s^{(0)}, y^{(0)})$ in the process, that point corresponding
to the approximating charge waveform, derived in the previous
section, was used.

The results of the numerical computations for the voltage-charge
relationship of (90), with $\nu = 2$ and $\nu = 3$, and several values of $m$,
are tabulated below. Tables II and III give the values of the maximum
power transfer, max $P_1$, together with the corresponding values of the


TABLE II — ($\nu = 2$)

  $m$     max $P_1$   $R_1$    $R_2$    $I_{\max}$   $(b^2 + c^2)$   $(d^2 + e^2)$
  0       0.2814      1.462    0.7310   0.7698       0.1482          0.0370
  1/2     0.7773      1.966    1.060    1.160        0.3289          0.0865
  1       1.284       2.300    1.300    1.549        0.5947          0.1573
  2       2.198       2.921    1.561    2.310        1.451           0.3484
  3 1/2   3.366       3.854    1.679    3.422        3.616           0.7414
  5       4.371       4.788    1.642    4.515        6.869           1.250
  7       5.544       6.020    1.474    5.951        12.92           2.097
  9       6.586       7.228    1.230    7.372        20.95           3.096

TABLE III — ($\nu = 3$)

  $m$     max $P_1$   $R_1$    $R_2$    $I_{\max}$   $(b^2 + c^2)$   $(d^2 + e^2)$
  0       0.1623      1.514    0.7389   0.7684       0.1499          0.0366
  1/2     0.6782      2.137    1.182    1.162        0.3257          0.0878
  1       1.246       2.428    1.529    1.560        0.5635          0.1652
  2       2.271       3.023    1.882    2.330        1.367           0.3682
  3 1/2   3.575       4.034    2.006    3.445        3.479           0.7703
  5       4.691       5.115    1.907    4.533        6.710           1.277
reactive powers in the fundamental and second harmonic, $R_1$ and $R_2$,
and the maximum current $I_{\max}$ associated with the charge waveform
$Q(x)$, that is, the maximum value of $|Q'(x)|$. It is worth noting that
$R_2$ does not continue to increase with $m$. Also included are the squares
of the amplitudes of the first and second harmonics in the charge waveform,
$(b^2 + c^2)$ and $(d^2 + e^2)$. These, together with the real and reactive
powers, determine the normalized impedances. Tables IV and V
give the values of $-s$ and $y$ which give max $P_1$, and also $y^{(0)}$ and $P_1^{(0)}$,
the value of $P_1$ corresponding to $y^{(0)}$ and $-s^{(0)} = 1/\sqrt{3} = 0.5774$. It
is interesting to note how close $P_1^{(0)}$ is to max $P_1$, except for the larger
values of $m$. Table VI compares max $P_1$ with the maximum obtainable
fundamental power, as obtained in Section III, for the case
$\nu = 2$ and several values of $m$. It is also worth comparing the value of
max $P_1 = 0.162$ for the case $\nu = 3$, $m = 0$ with the corresponding value
of 0.408 for the maximum obtainable fundamental power.


TABLE IV — ($\nu = 2$)

  $m$     $-s$     $y^{(0)}$   $y$      $P_1^{(0)}$   max $P_1$
  0       0.5774   0.3333      0.3333   0.2814        0.2814
  1/2     0.5839   0.2963      0.2942   0.7770        0.7773
  1       0.5848   0.2500      0.2426   1.283         1.284
  2       0.5716   0.1852      0.1782   2.192         2.198
  3 1/2   0.5465   0.1317      0.1301   3.307         3.366
  5       0.5246   0.1019      0.1046   4.199         4.371
  7       0.5008   0.0781      0.0844   5.197         5.544
  9       0.4816   0.0633      0.0717   6.047         6.586

TABLE V — ($\nu = 3$)

  $m$     $-s$     $y^{(0)}$   $y$      $P_1^{(0)}$   max $P_1$
  0       0.5742   0.3704      0.3704   0.1622        0.1623
  1/2     0.5871   0.3566      0.3562   0.6775        0.6782
  1       0.5977   0.2963      0.2829   1.241         1.246
  2       0.5875   0.2112      0.1989   2.262         2.271
  3 1/2   0.5591   0.1449      0.1419   3.518         3.575
  5       0.5331   0.1097      0.1130   4.536         4.691

TABLE VI — ($\nu = 2$)

  $m$                          0       1/2     1      2      3 1/2   5
  max $P_1$ (Section V)        0.281   0.777   1.28   2.20   3.37    4.37
  max $P_1$ (Section III)      0.687   1.83    3.02   5.50   9.33    13.15
5.5 On the Power Transfer From the Fundamental to the Third Harmonic


Breitzer et al’ were also concerned with the abrupt-junction diode 
operated in the region between forward conduction and reverse break- 
down and considered charge waveforms containing no higher than third 
harmonics. They treated in detail the power transfer from the fundamental
to the third harmonic, subject to $P_2 = 0$, and obtained a maximum
value of

$$P_1 = (0.0242)\pi^2 = 0.238 = -P_3, \tag{111}$$

making allowance for the difference in notation. This value of $P_1$ arose
from two distinct charge waveforms. One was

$$Q(x) = (0.5) + (0.310)\sin x + (0.168)\sin 2x + (0.155)\sin 3x, \tag{112}$$


and the other was quite close to this. We saw previously how by taking
the Fourier approximation, containing up to second harmonics, of a
charge waveform which gives a good lower bound for max $P_1$ subject
only to restrictions on $Q_{\max}$ and $Q_{\min}$, and suitably shifting and expanding
(or contracting) so that the restrictions on $Q_{\max}$ and $Q_{\min}$ are
satisfied by the approximating charge waveform, we could obtain a
good approximation to the maximum power transfer from the first to
second harmonic, when no higher than second harmonics are allowed.
In the case of the abrupt-junction diode operated in the region between
forward conduction and reverse breakdown, which is the diode that
we will consider in this section, it was found that the charge waveform
so derived was precisely one that gives the maximum power transfer.

Now, it is found that the best mean square approximation containing
up to third harmonics, and subject to $P_2 = 0$, to the charge waveform
which gives the good lower bound to the maximum obtainable fundamental
power is


aa) = [5+ 270 |, (113) 


where

$$f(x) = [(0.4)\sin x + (0.25)\sin 2x + (0.2)\sin 3x]. \tag{114}$$

We shift and contract the waveform of (113) by setting

$$\tilde{Q}(x) = \frac{1}{2M}[M + f(x)], \qquad M = \max |f(x)|, \tag{115}$$

so that $\tilde{Q}_{\max} = 1$ and $\tilde{Q}_{\min} = 0$. For this charge waveform,

$$P_1 = (0.0075)\pi^2/M^3 = -P_3; \qquad P_2 = 0. \tag{116}$$

It is found that

$$M = 0.680; \qquad P_1 = 0.235, \tag{117}$$

and $\tilde{Q}(x)$, as given by (114) and (115), is plotted in Fig. 7(a). The value
of $P_1$ in (117) is very close to the maximum value obtained by Breitzer
et al, (111), and it is interesting to compare Fig. 7(a) with Fig. 7(b),
which depicts $Q(x)$ as given by (112).
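The figures quoted in (117) can be approximated with a short grid search. This is only a sketch: it takes the rounded coefficients of $f(x)$ in (114) at face value and uses the expression $P_1 = (0.0075)\pi^2/M^3$ from (116), so small differences from the printed values are to be expected. The helper names are hypothetical.

```python
import math

def f(x):
    """The function f(x) of (114), with the rounded coefficients as printed."""
    return 0.4 * math.sin(x) + 0.25 * math.sin(2 * x) + 0.2 * math.sin(3 * x)

def max_abs(fn, n=200000):
    """Largest |fn(x)| on an equally spaced grid over one period."""
    return max(abs(fn(2 * math.pi * i / n)) for i in range(n))

M = max_abs(f)                       # compare with M = 0.680 in (117)
P1 = 0.0075 * math.pi**2 / M**3      # compare with P_1 = 0.235 in (117)
```

Both values come out close to those quoted in (117); the residual difference is consistent with the rounding of the coefficients in (114).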






















Fig. 7 — Charge waveforms giving (a) approximately, and (b) exactly, the 
maximum power transfer from fundamental to third harmonic in abrupt-junction 
diode operated in the region between forward conduction and reverse breakdown. 
VI. THE MAXIMIZATION OF THE POWER TRANSFER FROM THE FUNDAMENTAL
TO THE SECOND HARMONIC, FOR THE CURRENT-LIMITED DIODE


6.1 The Canonical Representation of the Charge Waveform 


We are concerned with charge waveforms as in (93) and impose the
restrictions

$$Q_{\max} = 1; \qquad |Q'|_{\max} = k. \tag{118}$$
We observe that, for voltage-charge relationships of the form given by 
(90), 
max [P; | Quisx = D, | Q’ a = | 


1 (119) 


= 9° max | Pri Qaax = 1,10" lane = 4], O<pasl. 


In Appendix C we determine a canonical representation of $Q(x)$, subject
to the conditions

$$Q_{\max} = 1; \qquad Q'_{\max} \le k; \qquad Q'_{\min} = -k, \tag{120}$$

by making use of the canonical representation obtained in Section 5.1,
when the charge waveform has prescribed maximum and minimum
values. Note that if $\bar{Q}(x) = Q(\pi - x)$, where $Q(x)$ satisfies the conditions
of (120), then

$$\bar{Q}_{\max} = 1; \qquad \bar{Q}'_{\max} = k; \qquad \bar{Q}'_{\min} \ge -k. \tag{121}$$

From Appendix C, the five coefficients in (93) are given in terms of
two parameters $s$ and $y$. It is found that

$$b = \frac{kw}{(w - z)}; \qquad d = \frac{kz}{2(w - z)}; \qquad c = 4e = -\frac{ksy}{(w - z)}, \tag{122}$$

where

$$w = \tfrac{1}{2}(1 - s^4) - s^2 y; \qquad z = \tfrac{1}{4}y(1 - s^2) - \tfrac{1}{8}(1 + s^2)^2, \tag{123}$$

and that

$$a = [1 - \max\,(b\sin x + c\cos x + d\sin 2x + e\cos 2x)], \tag{124}$$

which in general has to be determined numerically. The waveform is
translated so that

$$Q'(\pi) = -k = Q'_{\min}. \tag{125}$$

The parameter $s$ arises from the equation

$$Q'(2\tan^{-1} s) = Q'_{\max} \le k, \tag{126}$$

and the parameter $y$ is subject to the condition

$$0 \le y \le \tfrac{1}{2}(1 - s^2), \tag{127}$$

which of course also implies that $s^2 \le 1$. It is shown in Appendix C that

$$P_1 = 0 \quad \text{for} \quad y = \tfrac{1}{2}(1 - s^2). \tag{128}$$

In order to maximize $P_1$ subject to (118), it is sufficient, in view of
the correspondence between $Q(x)$ and $\bar{Q}(x) = Q(\pi - x)$ given by
(120) and (121), to use the above canonical representation and to
maximize $|P_1|$.


6.2 The Abrupt-Junction Diode 


We now consider the abrupt-junction diode operated in the region
between forward conduction and reverse breakdown, for which the
voltage-charge relationship is $V(Q) = Q^2$, $0 \le Q \le 1$. We first maximize
$|P_1|$ subject to the conditions of (120), and suppose that $k$ is
sufficiently small that $Q_{\min} \ge 0$. Using the canonical representation obtained
in the previous section, $P_1$ may be expressed in terms of $s$ and $y$.
Omitting the details, it is found that $|P_1|$ is maximized, subject to the
restriction (127), for $s^2 = \tfrac{1}{3}$, $y = 0$. The charge waveform giving this
maximum is

$$Q(x) = 1 + k[S(x) - S_{\max}], \tag{129}$$

where

$$S(x) = \tfrac{1}{6}(4\sin x - \sin 2x). \tag{130}$$

It is readily verified that $S_{\max} = g = -S_{\min}$, where

$$g = \frac{(1 + \sqrt{3})}{2\sqrt{2}\,(3)^{1/4}} = 0.734. \tag{131}$$

Thus $Q_{\min} = (1 - 2gk)$, so that $Q_{\min} \ge 0$ for $2gk \le 1$. This $Q(x)$ actually
gives a negative value of $P_1$, so that $\bar{Q}(x) = Q(\pi - x)$ maximizes
$P_1$, and it is found that

$$\max P_1 = \frac{2\pi^2}{27}k^3 = 0.7311k^3, \qquad \text{for } k \le \frac{1}{2g} = 0.681. \tag{132}$$

Fig. 8 depicts $S(\pi - x)$.
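The constants of this section are easy to reproduce. The sketch below assumes the forms $S(x) = \tfrac{1}{6}(4\sin x - \sin 2x)$ for (130), $g = (1 + \sqrt{3})/[2\sqrt{2}\,(3)^{1/4}]$ for (131), and the coefficient $2\pi^2/27$ for (132); it compares a grid estimate of $S_{\max}$ with the closed form and evaluates the critical current $1/(2g)$. All names are illustrative.

```python
import math

def S(x):
    """Normalized waveform of (130) (assumed form)."""
    return (4 * math.sin(x) - math.sin(2 * x)) / 6

def grid_max(fn, n=200000):
    """Largest value of fn on an equally spaced grid over one period."""
    return max(fn(2 * math.pi * i / n) for i in range(n))

g_numeric = grid_max(S)
g_closed = (1 + math.sqrt(3)) / (2 * math.sqrt(2) * 3 ** 0.25)   # (131)
k_critical = 1 / (2 * g_closed)        # largest k with Q_min >= 0
power_coefficient = 2 * math.pi**2 / 27  # coefficient of k^3 in (132)
```

The grid maximum and the closed form for $g$ should coincide to high accuracy, and `k_critical` and `power_coefficient` should round to the values 0.681 and 0.7311 quoted in (132).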
Fig. 8 — Shifted and normalized charge waveform for maximum power transfer
from fundamental to second harmonic in current-limited abrupt-junction
diode, with maximum current less than a critical value.


The fundamental reactive power corresponding to max $P_1$ is

$$R_1 = \frac{8\pi^2}{9}(1 - gk)k^2. \tag{133}$$

But, for the voltage-charge relationship $V(Q) = Q^2$, the addition of a
constant to the charge waveform does not affect $P_1$. Hence, if instead
of requiring $Q_{\max} = 1$ we just require $0 \le Q(x) \le 1$, we have

$$R_1 = \frac{8\pi^2}{9}ak^2 = 8.77ak^2; \qquad gk \le a \le (1 - gk). \tag{134}$$


6.3 The Optimum Operating Frequency 


So far, no discussion has been made of the angular frequency $\omega$ of
the actual periodicity of the charge waveform. We here consider this
factor in the case of the abrupt-junction diode operated in the region
between forward conduction and reverse breakdown. Now the physical
limitation placed on the maximum current magnitude takes the form

$$|Q'(x)| \le \frac{\kappa}{\omega} \tag{135}$$

from (20). Also, the actual fundamental power is, from (9), proportional
to $\omega P_1$. We thus consider the maximization of $\omega P_1$ as $\omega$ varies,
where the charge waveform $Q(x)$, containing no higher than second
harmonics, is subject to $0 \le Q(x) \le 1$ and the condition in (135). We
make use of results from Section V, as well as from the previous section.
Thus, we define

$$\max\,[P_1 \mid Q_{\max} = 1,\ Q_{\min} = -m] = P(m), \tag{136}$$
and let the value of $|Q'|_{\max}$ for the charge waveform which gives $P(m)$
be denoted by $K(m)$. Then, remembering that the addition of a constant
to the charge waveform does not affect $P_1$, since $V(Q) = Q^2$, we
obtain from (95) and (103),

$$P(m) = \frac{4\pi^2}{81\sqrt{3}}(1 + m)^3, \tag{137}$$

and also, from (104),

$$K(m) = (1 + m)K(0) = \frac{4}{3\sqrt{3}}(1 + m). \tag{138}$$

Similarly, we define

$$\max\,[P_1 \mid Q_{\max} = 1,\ |Q'|_{\max} = k] = \Pi(k), \tag{139}$$

and let the value of $Q_{\min}$ for charge waveforms which give $\Pi(k)$ be denoted
by $-M(k)$. Then, from the previous section,

$$\Pi(k) = \frac{2\pi^2}{27}k^3; \qquad M(k) = -(1 - 2gk), \tag{140}$$

where $g$ is given by (131).

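Now if $Q(x)$ is subject to just the restriction $0 \le Q(x) \le 1$, then
max $P_1 = P(0)$, from (137). But, from (138), if $(\omega/\kappa) \le (3\sqrt{3})/4$
then the $Q(x)$ which give this value of max $P_1$ satisfy (135). Hence,

$$\left(\frac{\omega}{\kappa}\right)\max P_1 = 0.2814\left(\frac{\omega}{\kappa}\right), \qquad 0 \le \left(\frac{\omega}{\kappa}\right) \le 1.299. \tag{141}$$

Note that if $(\omega/\kappa) > 1.299$, then this gives an upper bound on
$(\omega/\kappa)$ max $P_1$. Also, if $(\omega/\kappa) > 1.299$, then max $P_1 \ge P(m)$ if $K(m) =
(\kappa/\omega)$, and hence, from (137) and (138),

$$\left(\frac{\omega}{\kappa}\right)\max P_1 \ge 0.617\left(\frac{\kappa}{\omega}\right)^2, \qquad \left(\frac{\omega}{\kappa}\right) > 1.299. \tag{142}$$

From (140), setting $k = (\kappa/\omega)$, we have

$$\left(\frac{\omega}{\kappa}\right)\max P_1 = 0.731\left(\frac{\kappa}{\omega}\right)^2, \qquad \left(\frac{\omega}{\kappa}\right) \ge 1.468, \tag{143}$$

and if $0 \le (\omega/\kappa) < (1.468) = 2g$, then this provides an upper bound
on $(\omega/\kappa)$ max $P_1$. Also, if $0 \le (\omega/\kappa) < 2g$, then max $P_1 \ge \Pi[1/(2g)]$,
from (140). Hence,

$$\left(\frac{\omega}{\kappa}\right)\max P_1 \ge 0.231\left(\frac{\omega}{\kappa}\right), \qquad 0 \le \left(\frac{\omega}{\kappa}\right) < 1.468. \tag{144}$$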
Fig. 9 shows $(\omega/\kappa)$ max $P_1$ as a function of $(\omega/\kappa)$. For

$$1.299 < \left(\frac{\omega}{\kappa}\right) < 1.468, \tag{145}$$

the curve lies between the dashed lines. Thus, from the viewpoint of
maximizing the actual fundamental real power, the optimum operating
frequency, when the diode is not allowed to operate in the forward
conduction region, lies in the range given by (145). Also, we can assert
that

$$0.3655 = \frac{\pi^2}{27} \le \max\left[\left(\frac{\omega}{\kappa}\right)P_1\right] \le \frac{2\pi^2(4)^{1/3}}{81} = 0.387. \tag{146}$$
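The numerical constants appearing in (141) to (146) follow from a handful of closed forms. The sketch below is a hedged illustration: it assumes $P(0) = 4\pi^2/(81\sqrt{3})$, $K(0) = 4/(3\sqrt{3})$, $\Pi(k) = (2\pi^2/27)k^3$ and $g$ as in (131), and then forms the upper and lower bounds on $(\omega/\kappa)\max P_1$ as functions of $x = \omega/\kappa$. The function and variable names are not from the paper.

```python
import math

P0 = 4 * math.pi**2 / (81 * math.sqrt(3))      # P(0), assumed form of (137)
K0 = 4 / (3 * math.sqrt(3))                    # K(0), from (138)
G = (1 + math.sqrt(3)) / (2 * math.sqrt(2) * 3 ** 0.25)   # g of (131)
PI_COEFF = 2 * math.pi**2 / 27                 # Pi(k) = PI_COEFF * k^3, (140)

def upper_bound(x):
    """Smaller of the two upper bounds, from (141) and (143)."""
    return min(P0 * x, PI_COEFF / x**2)

def lower_bound(x):
    """Larger of the two lower bounds, from (141)/(142) and (143)/(144)."""
    first = P0 * x if x <= 1 / K0 else (P0 / K0**3) / x**2
    second = (PI_COEFF / (2 * G) ** 3) * x if x < 2 * G else PI_COEFF / x**2
    return max(first, second)

grid = [i * 1e-4 for i in range(1, 40001)]     # x = omega/kappa in (0, 4]
best_lower = max(lower_bound(x) for x in grid)
best_upper = max(upper_bound(x) for x in grid)
```

With these assumptions `1 / K0` and `2 * G` give the break points 1.299 and 1.468, while `best_lower` and `best_upper` reproduce the two ends of (146).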

6.4 Maximization of the Power Transfer, When the Region of Operation 
Includes Forward Conduction 


In a previous section we obtained a canonical representation of a
charge waveform $Q(x)$, containing no higher than second harmonics,
for which $Q_{\max} = 1$, $Q'_{\max} \le k$ and $Q'_{\min} = -k$. This canonical representation
is given by (93), (122), (123) and (124), and involves two
parameters $s$ and $y$ which lie in a bounded domain given by $0 \le y \le
\tfrac{1}{2}(1 - s^2)$. It was shown that, independently of the voltage-charge
relationship, $P_1 = 0$ on $y = \tfrac{1}{2}(1 - s^2)$. Moreover, it was seen that in
order to maximize $P_1$ subject to $Q_{\max} = 1$ and $|Q'|_{\max} = k$, it is sufficient
to consider this canonical representation and to maximize $|P_1|$.




Fig. 9 — Maximum power transfer from fundamental to second harmonic in 
current-limited abrupt-junction diode operated in the region between forward 
conduction and reverse breakdown, vs. frequency. 
The maximization was carried out analytically for the abrupt-junction
diode when $k$ is sufficiently small that the diode does not operate in the
forward conduction region. We treat here, by means of numerical computation,
the abrupt-junction diode when partial operation in the forward
conduction region takes place, the normalized voltage-charge
relationship being given by (61).

Again, the maximization process was that of fitting a quadric surface,
and this time it was also necessary to calculate $a$ in (124) numerically.
Further, it was desirable to first compute the value of $P_1$ over a rough
grid, and then to pick appropriate values $s^{(0)}$ and $y^{(0)}$, as a starting point
in the maximization process. Thus, for several values of $k$, max $|P_1|$,
i.e., $\Pi(k)$ in the notation of (139), was computed in the manner described
above. For the values of $s$ and $y$ which gave max $|P_1|$, the corresponding
values of $R_1$ and $R_2$, the reactive powers in the fundamental
and second harmonic, and of $Q_{\min}$, i.e., $-M(k)$ in the notation of the
previous section, were calculated, together with $(b^2 + c^2)$ and $(d^2 + e^2)$,
the squares of the amplitudes of the first and second harmonics in the
charge waveform. The results of the numerical computations are tabulated
in Table VII. We note that the values of $P_1$ corresponding to the
given values of $s$ and $y$ are negative. If $Q(x)$ is the charge waveform
corresponding to $s$ and $y$, (93), (122), (123), and (124), then the positive
value of $P_1$, that is $\Pi(k)$, is obtained from the charge waveform
$\bar{Q}(x) = Q(\pi - x)$, or any translation thereof.

Now, from (119) with $\nu = 2$, and from (139),

$$\max\,[P_1 \mid Q_{\max} = p,\ |Q'|_{\max} = k] = p^3\,\Pi\!\left(\frac{k}{p}\right), \qquad 0 < p \le 1. \tag{147}$$

For $Q_{\max} \le 0$ we have $P_1 = 0$, from (61). We may write

$$\frac{p^3\,\Pi(k/p)}{\Pi(k)} = \frac{(k/p)^{-3}\,\Pi(k/p)}{k^{-3}\,\Pi(k)}. \tag{148}$$

The quantity $k^{-3}\Pi(k)$ is depicted in Fig. 10(a), and it is seen to be a
nonincreasing function of $k$. It follows, from (147) and (148), since
$\Pi(k)$ is a strictly increasing function of $k$, that max $P_1$ subject to $Q_{\max} \le
1$ and $|Q'|_{\max} \le k$ is attained with $Q_{\max} = 1$ and $|Q'|_{\max} = k$. For
$k \le 1/(2g) = 0.681$, it can also be attained with $2gk \le Q_{\max} \le 1$ and
$|Q'|_{\max} = k$. We comment that for the voltage-charge relationship
$V(Q) = \max\,(0,Q)$, max $P_1$ subject to $Q_{\max} \le 1$ and $|Q'|_{\max} \le k$ is
not attained with $Q_{\max} = 1$, for sufficiently small $k$, since in this case
$P_1 = 0$ if $Q_{\min} \ge 0$.
Let us now consider the frequency factor, as we did at the end of the
previous section, so that (135) holds. Hence, setting $k = (\kappa/\omega)$,

$$\max\left[\left(\frac{\omega}{\kappa}\right)P_1\right] = \max\left[\frac{\Pi(k)}{k}\right]. \tag{149}$$

The curve in Fig. 10(b) depicts $\Pi(k)/k$ and it is seen to be an increasing
function of $k$ in the range shown, although it is to be expected that
it tends to zero as $k \to \infty$. It appears that $\max\,[\Pi(k)/k] \approx 1$, so that,
from (146), a considerable improvement is obtained if the diode is permitted
to operate in the forward conduction region. We must bear in
mind, however, that we have idealized the voltage-charge relationship
in the forward conduction region.
Fig. 10 — Maximum power transfer from fundamental to second harmonic
divided by (a) the cube of the maximum current, and (b) the maximum current,
for current-limited abrupt-junction diode with operation in forward conduction
region permitted, vs. the maximum current.
TABLE VII

  $k$     $\Pi(k)$   $R_1$    $R_2$     $M(k)$
  0.75    0.3058     2.167    0.3033    0.0896
  1.0     0.6159     2.440    0.5075    0.3980
  1.5     1.265      2.840    0.8274    1.027
  2.25    2.169      3.483    1.058     2.013
  3.0     2.979      4.153    1.135     3.024

  $k$     $-s$       $y$      $(b^2 + c^2)$   $(d^2 + e^2)$
  0.75    0.5988     0.0018   0.2404          0.0169
  1.0     0.6647     0.0288   0.3653          0.0393
  1.5     0.7035     0.0834   0.7159          0.1124
  2.25    0.7058     0.1354   1.613           0.2775
  3.0     0.6996     0.1663   3.006           0.5039
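The last two columns of Table VII can be recomputed from the tabulated $k$, $s$ and $y$. The sketch below assumes the coefficient formulas $b = kw/(w - z)$, $d = kz/[2(w - z)]$, $c = 4e = -ksy/(w - z)$ for (122), with $w$ and $z$ as in (123); the function names are illustrative, and agreement is limited by the four-figure rounding of the table entries.

```python
def current_limited_coefficients(k, s, y):
    """Coefficients b, c, d, e of the current-limited canonical waveform
    (assumed forms of (122) and (123))."""
    w = 0.5 * (1 - s**4) - s**2 * y
    z = 0.25 * y * (1 - s**2) - 0.125 * (1 + s**2) ** 2
    b = k * w / (w - z)
    d = k * z / (2 * (w - z))
    c = -k * s * y / (w - z)
    e = c / 4
    return b, c, d, e

def harmonic_amplitudes_squared(k, s, y):
    """Return (b^2 + c^2, d^2 + e^2), the squared harmonic amplitudes."""
    b, c, d, e = current_limited_coefficients(k, s, y)
    return b * b + c * c, d * d + e * e

# Rows k = 0.75 and k = 1.0 of Table VII (s is negative, so -s is tabulated).
row_075 = harmonic_amplitudes_squared(0.75, -0.5988, 0.0018)
row_100 = harmonic_amplitudes_squared(1.0, -0.6647, 0.0288)
```

Under these assumptions the two rows agree with the tabulated pairs (0.2404, 0.0169) and (0.3653, 0.0393) to within rounding.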
APPENDIX A


From (28) and (29), for any $\lambda$ (which we take to be real),

$$\begin{aligned} P_1 &= \left(\int_0^{2\pi}\{\lambda Q(x) - V[Q(x)]\}\sin x\,dx\right)\left(\int_0^{2\pi} Q(x)\cos x\,dx\right) \\ &\quad - \left(\int_0^{2\pi}\{\lambda Q(x) - V[Q(x)]\}\cos x\,dx\right)\left(\int_0^{2\pi} Q(x)\sin x\,dx\right) \\ &= A\int_0^{2\pi}\{\lambda Q(x) - V[Q(x)]\}\sin(x - \theta)\,dx, \end{aligned} \tag{150}$$

where

$$A\sin\theta = \int_0^{2\pi} Q(x)\sin x\,dx; \qquad A\cos\theta = \int_0^{2\pi} Q(x)\cos x\,dx. \tag{151}$$

Hence,

$$A = \int_0^{2\pi} Q(x)\cos(x - \theta)\,dx. \tag{152}$$

Now, max $P_1$ = max $|P_1|$. Since

$$\left|\int_0^{2\pi} f(x)\sin(x - \phi)\,dx\right| \le 2(\max f - \min f), \tag{153}$$

(31), (150) and (152) lead to (85) and (86) in Section 4.2. We next
derive the inequalities (88), where $L$ is defined by (73) and (74). Now,


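$$\begin{aligned} L &\ge \max_{-m \le \sigma \le 1} L(\sigma,1,-m) \\ &= \max_{-m \le \sigma \le 1}\,[(1 + m)V(\sigma) - (m + \sigma)V(1) + (\sigma - 1)V(-m)] \\ &= -(1 + m)\min_{-m \le \kappa \le 1}\left\{\left[\frac{V(1) - V(-m)}{(1 + m)}\right]\kappa - V(\kappa)\right\} - [mV(1) + V(-m)]. \end{aligned} \tag{154}$$

Also,

$$\begin{aligned} L &\ge \max_{-m \le \rho \le 1} L(1,\rho,-m) \\ &= (1 + m)\max_{-m \le \kappa \le 1}\left\{\left[\frac{V(1) - V(-m)}{(1 + m)}\right]\kappa - V(\kappa)\right\} + [mV(1) + V(-m)]. \end{aligned} \tag{155}$$

Hence, from (86), (154) and (155),

$$2L \ge (1 + m)U. \tag{156}$$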
Also, from (73) and (74), 
L= max  {(r—p)[\o — V(o)] + (o — r) [Ap — V(p)] 
—ms (o,p,7)<1 ( 157) 


+ (p — ofr — V(r)h}, 


for any (real) \. In view of the remarks preceding (74) we may assume 
either that -m SoS p78 il,orthat -ms7rSpSce £1, 
without loss of generality. In the former case 


(7 — p)[Ao — V(o)] + (o — 7) [Dp — V(e)] + (p — o) [Ar — V(r)] 


Se Sp) _max [hx = Vi) ae = ) ID [x — V(«)] 
+ (p— a) _max [Ak — V(x)] (158) 


sila ie) mae © [Ne Ve) min Neo i) 
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Hence 
(7 — p)[Ao — V(o)] + (o — 7) [Ap — V(p)] + (p — o) [Ar — V(7)] 


< (1+ m){ max [Ac — V(x)] - min [Ak — V(x)}}. (159) 


mses 


Equation (159) may be derived, in a similar manner, when —m S 
7 Xpso 1. Thus, from (157), 


Ls(1+ m){_ max [he — V(x)] — min D« — V(«)]}. (160) 


But this is true for all (real) \. Hence, from (86) 
Ls A1+m)U. (161) 


If we do not restrict the voltage-charge relationship then the bounds
given by (156) and (161) cannot be improved. This is demonstrated
by considering the (somewhat artificial) relationship

$$V(Q) = \begin{cases} 1, & Q = [(1 + m)\alpha - m]; \\ -1, & Q = [1 - (1 + m)\alpha]; \\ 0, & \text{otherwise}; \end{cases} \qquad m > -1, \quad 0 < \alpha < \tfrac{1}{2}. \tag{162}$$

It may be verified that in this case

$$L = (1 + m) = (1 - \alpha)(1 + m)U. \tag{163}$$
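We now find a class of voltage-charge relationships for which the
bound in (161) is attained. If $\sigma \ge \tau$, then, by the definition of $U$ in
(86),

$$\begin{aligned} (\sigma - \tau)U \le{} &\max_{-m \le \rho \le 1}\{[V(\sigma) - V(\tau)]\rho - (\sigma - \tau)V(\rho)\} \\ &- \min_{-m \le \rho \le 1}\{[V(\sigma) - V(\tau)]\rho - (\sigma - \tau)V(\rho)\}. \end{aligned} \tag{164}$$

Let $-m \le \tau \le \sigma \le 1$. Then,

$$\begin{aligned} L &\ge \max_{-m \le \rho \le 1}[(\rho - \tau)V(\sigma) + (\tau - \sigma)V(\rho) + (\sigma - \rho)V(\tau)] \\ &= \max_{-m \le \rho \le 1}\{[V(\sigma) - V(\tau)]\rho - (\sigma - \tau)V(\rho)\} + [\sigma V(\tau) - \tau V(\sigma)] \\ &\ge (\sigma - \tau)U + [\sigma V(\tau) - \tau V(\sigma)] + \min_{-m \le \rho \le 1}\{[V(\sigma) - V(\tau)]\rho - (\sigma - \tau)V(\rho)\} \\ &= (\sigma - \tau)U + \min_{-m \le \rho \le 1}[(\rho - \tau)V(\sigma) + (\tau - \sigma)V(\rho) + (\sigma - \rho)V(\tau)]. \end{aligned} \tag{165}$$

Hence

$$\min_{-m \le \rho \le 1}[(\rho - \tau)V(\sigma) + (\tau - \sigma)V(\rho) + (\sigma - \rho)V(\tau)] \ge 0, \quad \text{and} \quad -m \le \tau \le \sigma \le 1 \ \Rightarrow\ L \ge (\sigma - \tau)U. \tag{166}$$

Since, as may be readily deduced from (73), (74) and (86),

$$L[-V] = L[V]; \qquad U[-V] = U[V], \tag{167}$$

it follows, by replacing $V$ by $-V$ in (166), that

$$\max_{-m \le \rho \le 1}[(\rho - \tau)V(\sigma) + (\tau - \sigma)V(\rho) + (\sigma - \rho)V(\tau)] \le 0, \quad \text{and} \quad -m \le \tau \le \sigma \le 1 \ \Rightarrow\ L \ge (\sigma - \tau)U. \tag{168}$$

In particular, we deduce from (166) and (168) that if
$[(\rho + m)V(1) - (m + 1)V(\rho) + (1 - \rho)V(-m)]$ does not change
sign in $-m \le \rho \le 1$, then

$$L \ge (1 + m)U \ \Rightarrow\ L = (1 + m)U, \tag{169}$$

using (161). Note that, for this class, $L$ in (74) is given with $\sigma = 1$,
$\tau = -m$, or vice versa, and $U$ in (86) is given with

$$\lambda = (V(1) - V(-m))/(1 + m).$$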
APPENDIX B 
We here determine the canonical form of

$$Q(x) = a + b\sin x + c\cos x + d\sin 2x + e\cos 2x, \tag{170}$$

such that

$$Q(2\tan^{-1} s) = Q_{\max} = 1; \qquad Q(\pi) = Q_{\min} = -m. \tag{171}$$

Now, (171) implies that

$$Q'(2\tan^{-1} s) = 0 = Q'(\pi). \tag{172}$$

Substitution of (171) and (172) into (170) leads to

$$a = [(c - e) - m]; \qquad b = 2d, \tag{173}$$

and, after substitution for $a$ and $d$, and some reductions,

$$\begin{aligned} 2c(1 + s^2) - 8es^2 &= [(1 + m)(1 + s^2)^2 - 4bs]; \\ cs(1 + s^2) + 4es(1 - s^2) &= b(1 - 3s^2). \end{aligned} \tag{174}$$
From the second equation above it follows that $b = 0$ if $s = 0$. Consequently,
we introduce a new parameter $y$ and set

$$b = (1 + m)sy. \tag{175}$$

Solving (174) for $c$ and $e$, we then obtain

$$\begin{aligned} c &= (1 + m)[\tfrac{1}{2}(1 - s^4) - s^2 y]; \\ e &= (1 + m)[\tfrac{1}{4}y(1 - s^2) - \tfrac{1}{8}(1 + s^2)^2]. \end{aligned} \tag{176}$$

Equations (173), (175) and (176) express the five coefficients in
(170) in terms of the two parameters $s$ and $y$, and the given quantity
$m$. By satisfying (172) we have ensured only that $x = \pi$ and
$x = 2\tan^{-1} s$ are stationary points of $Q(x)$. We now proceed to determine
the restrictions on the parameters $s$ and $y$ so that $Q(\pi) = Q_{\min}$
and $Q(2\tan^{-1} s) = Q_{\max}$. We observe that in passing from a region in
the $(s,y)$-plane in which $Q_{\max} = 1$ and $Q_{\min} = -m$ to one in which both
these conditions do not hold, either $Q(x)$ has equal maxima or equal
minima. We therefore determine the condition for $Q(x) = 1$ to have a
second double root and the condition for $Q(x) = -m$ to have a second
double root. From (170), (173), (175) and (176), setting $t = \tan\,(x/2)$,
the condition $Q(x) = 1$ becomes, after some reduction,

$$(t - s)^2[t^2 + 2st + (s^2 + 2y)] = 0. \tag{177}$$

Hence, the condition that $Q(x) = 1$ should have a second double root
is just

$$y = 0. \tag{178}$$

Similarly, the condition $Q(x) = -m$ becomes, after some reduction,

$$\{2t^2[(1 + s^2) - y] + 4syt + [(1 - s^4) - 2s^2 y]\}/(1 + t^2)^2 = 0, \tag{179}$$

of which $t = \infty$ is a double root. The condition for a second double
root is

$$y = (1 - s^2). \tag{180}$$

Thus, from (178), $y = 0$ separates the region $Q_{\max} > 1$ from the region
$Q_{\max} = 1$ and, from (180), $y = (1 - s^2)$ separates the region
$Q_{\min} < -m$ from the region $Q_{\min} = -m$. Now, on $s = 0$ we have, from
(170), (173), (175) and (176),

$$Q(\pi/2) = [\tfrac{1}{4}(3 - 2y)(1 + m) - m] = [1 - \tfrac{1}{4}(1 + 2y)(1 + m)], \tag{181}$$

so that $Q_{\max} > 1$ for $s = 0$, $y < -\tfrac{1}{2}$ and $Q_{\min} < -m$ for $s = 0$, $y > \tfrac{3}{2}$.
Hence $Q_{\max} > 1$ for $y < 0$, and $Q_{\min} < -m$ for $y > (1 - s^2)$. The region
of interest, i.e., $Q_{\max} = 1$ and $Q_{\min} = -m$, is given by

$$0 \le y \le (1 - s^2). \tag{182}$$
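The canonical form derived above lends itself to a direct numerical check. The following sketch assumes the coefficient formulas $a = (c - e) - m$, $b = 2d = (1 + m)sy$, $c = (1 + m)[\tfrac{1}{2}(1 - s^4) - s^2 y]$ and $e = (1 + m)[\tfrac{1}{4}y(1 - s^2) - \tfrac{1}{8}(1 + s^2)^2]$ for (173), (175) and (176), and verifies on a grid that a point inside the region (182) gives $Q_{\max} = 1$ and $Q_{\min} = -m$. The function names and the sample point are illustrative.

```python
import math

def canonical_coefficients(s, y, m):
    """Coefficients of (170) from s, y, m (assumed forms of (173)-(176))."""
    b = (1 + m) * s * y
    d = b / 2
    c = (1 + m) * (0.5 * (1 - s**4) - s**2 * y)
    e = (1 + m) * (0.25 * y * (1 - s**2) - 0.125 * (1 + s**2) ** 2)
    a = (c - e) - m
    return a, b, c, d, e

def Q(x, coeffs):
    """Charge waveform of (170)."""
    a, b, c, d, e = coeffs
    return (a + b * math.sin(x) + c * math.cos(x)
            + d * math.sin(2 * x) + e * math.cos(2 * x))

def extremes(coeffs, n=100000):
    """Grid estimates of (Q_max, Q_min) over one period."""
    values = [Q(2 * math.pi * i / n, coeffs) for i in range(n)]
    return max(values), min(values)

s, y, m = 0.3, 0.4, 2.0            # sample point with 0 <= y <= 1 - s^2
coeffs = canonical_coefficients(s, y, m)
q_max, q_min = extremes(coeffs)
```

For this sample point the grid extremes should equal 1 and $-m$, attained at $x = 2\tan^{-1} s$ and $x = \pi$ respectively; moving $y$ outside the interval of (182) is expected to break one of the two conditions.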
We next consider the fundamental power when the charge waveform
$Q(x)$ contains no higher than second harmonics. From (28), (29) and
(170),

$$P_1 = \pi\int_0^{2\pi} V[Q(x)](b\cos x - c\sin x)\,dx. \tag{183}$$

We determine conditions under which $P_1 = 0$, independently of the
voltage-charge relationship $V = V(Q)$. This is clearly the case if
$b = 0 = c$, or if $Q(x)$, as given by (170), is a single-valued function
of $(b\sin x + c\cos x)$, for then the integrand in (183) is the derivative
of a periodic function. Noting that

$$2(b\sin x + c\cos x)^2 = (b^2 + c^2) + 2bc\sin 2x + (c^2 - b^2)\cos 2x, \tag{184}$$

it follows from (170) that the latter condition holds if

$$d = 2\lambda bc; \qquad e = \lambda(c^2 - b^2), \tag{185}$$

for some $\lambda$. Combining this condition with $b = 0 = c$,

$$2bce + d(b^2 - c^2) = 0 \ \Rightarrow\ P_1 = 0. \tag{186}$$

We now consider the canonical representation of $Q(x)$, with $Q_{\max} = 1$
and $Q_{\min} = -m$, wherein the coefficients in (170) are given by (97).
Then condition (186) becomes, upon reduction,

$$y = 0, \quad \text{or} \quad y = (1 - s^2), \quad \text{or} \quad s = 0 \ \Rightarrow\ P_1 = 0. \tag{187}$$


APPENDIX C 


We here determine the canonical form of $Q(x)$, as given by (170),
such that

$$Q_{\max} = 1; \qquad Q'_{\max} \le k; \qquad Q'_{\min} = Q'(\pi) = -k. \tag{188}$$

Now, when the charge waveform $\hat{Q}(x)$ is subject to $\hat{Q}_{\max} = 1$ and
$\hat{Q}_{\min} = -m$, the five coefficients corresponding to those in (170) are
given in terms of two parameters $s$ and $y$, from (97), by

$$\hat{a} = [(\hat{c} - \hat{e}) - m]; \qquad \hat{b} = 2\hat{d} = (1 + m)sy; \qquad \hat{c} = (1 + m)w; \qquad \hat{e} = (1 + m)z, \tag{189}$$

where

$$w = \tfrac{1}{2}(1 - s^4) - s^2 y; \qquad z = \tfrac{1}{4}y(1 - s^2) - \tfrac{1}{8}(1 + s^2)^2. \tag{190}$$

The charge waveform is translated so that $\hat{Q}(\pi) = -m$, which may be
done without loss of generality. The parameter $s$ arises from the condition
$\hat{Q}(2\tan^{-1} s) = 1$, and the parameter $y$ is subject to the condition
$0 \le y \le (1 - s^2)$, which of course also implies $s^2 \le 1$. If, in addition,
$\hat{a} = 0$, then

$$(1 + m)(w - z) = m, \tag{191}$$

and hence, from (190),

$$2y(1 + 3s^2) = [(1 + s^2)(5 - 3s^2) - 8m/(1 + m)]. \tag{192}$$

Now $0 \le y \le (1 - s^2)$, but if we require $m \ge 1$ then

$$0 \le y \le \tfrac{1}{2}(1 - s^2), \qquad (m \ge 1). \tag{193}$$

Turning to a charge waveform $Q(x)$, as given by (170), which satisfies
the conditions of (188), we may write

$$Q'(x) = \frac{k}{m}\hat{Q}(x), \qquad m \ge 1, \tag{194}$$

where (191) and (193) hold. Hence,

$$Q'(x) = \frac{k[sy(\sin x + \tfrac{1}{2}\sin 2x) + w\cos x + z\cos 2x]}{(w - z)}. \tag{195}$$

Integrating, and remembering that $Q_{\max} = 1$,

$$Q(x) = \{1 + k[S(x) - S_{\max}]\}, \tag{196}$$

where

$$S(x) = \frac{[w\sin x + \tfrac{1}{2}z\sin 2x - sy(\cos x + \tfrac{1}{4}\cos 2x)]}{(w - z)}. \tag{197}$$

In general, $S_{\max} = \max\,[S(x)]$ is determined numerically.

We now turn to the fundamental power, $P_1$, when $Q(x)$ has the above
canonical representation. From (170), (186), (190), (196) and (197),
we find that $P_1 = 0$, independently of the voltage-charge relationship,
if

$$[(1 - s^2) - 2y]\{(1 - s^2)(1 + 3s^2) - s^2[(1 - s^2) - 2y]^2\} = 0. \tag{198}$$

In view of (193), the second factor vanishes only if $s^2 = 1$, $y = 0$.
Hence we conclude that

$$P_1 = 0 \quad \text{for} \quad y = \tfrac{1}{2}(1 - s^2). \tag{199}$$
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The Design and Analysis of Pattern 


Recognition Experiments 


By W. H. HIGHLEYMAN 
(Manuscript received March 2, 1961) 


A popular procedure for testing a pattern recognition machine is to 
present the machine with a set of patterns taken from the real world. The 
proportion of these patterns which are misrecognized or rejected is taken as 
the estimate of the error probability or rejection probability for the machine. 
In Part I, this testing procedure is discussed for the cases of unknown and
known a priori probabilities of occurrence of the pattern classes. The differences
between the tests that should be made in the two cases are noted, and
confidence intervals for the test results are indicated. These concepts are
applied to various published pattern recognition results by determining the 
appropriate confidence interval for each result. 

In Part II, the problem of the optimum partitioning of a sample of fixed 
size between the design and test phases of a pattern recognition machine is 
discussed. One important nonparametric result is that the proportion of the 
total sample used for testing the machine should never be less than that 
proportion used for designing the machine, and in some cases should be a 
good deal more. 


PART I— ON ANALYSIS 


INTRODUCTION 


There are two distinct and consecutive processes usually involved in
the feasibility study of a pattern recognition method or machine. The
first process is the actual design of the machine. This might be based
upon a set of sample patterns which the experimenter has gathered,
from which he estimates the parameters of the machine. Alternatively,
the experimenter may base his design on some a priori knowledge concerning
the pertinent characteristics of the pattern classes under study.
The second process is then the testing of this machine either in its hardware
form or by its simulation on a general purpose computer. A different
set of sample patterns from that used in the design is used in this
stage.

The popular procedure for interpreting the test results is to take the 
proportion of patterns in the test data which have been misrecognized 
or rejected by the machine as the estimates of the error probability and 
rejection probability, respectively, for the machine. There are several 
questions which might be raised concerning this testing procedure, such 
as: 

1. Are these estimates the best estimates? 

2. If so, how good are these estimates? 

3. How does the estimate improve as the sample size is increased? 

Questions such as these are discussed in Part I of this paper. Two 
cases are considered; one is the case in which the a priori probabilities 
of class occurrence are unknown, and the other case assumes full knowl- 
edge of the a priori probabilities. 


Case 1. Unknown a priori Probabilities — Random Sampling


Let the number of allowable pattern classes be $c$. It will be assumed
that, for each allowable class $i$, there exists an a priori probability of
occurrence $w_i$, a probability of error $e_i$, and a probability of rejection
$r_i$. (For the rest of this paper, the term "error" will refer to an undetected
error; all detected errors will be assumed to be rejected.) These
probabilities are unknown to the experimenter, who is interested in estimating
the over-all probability of error for the machine,

$$e = \sum_{i=1}^{c} w_i e_i, \tag{1}$$

and the over-all probability of rejection,

$$r = \sum_{i=1}^{c} w_i r_i. \tag{2}$$


Let him perform the following experiment, which will be called random 
sampling. Consider the patterns to be randomly generated by a “pattern 
source” according to the a priori probabilities of occurrence. He takes a 
pattern from the source, identifies it, and then lets his pattern recogni- 
tion machine attempt identification. He notes which of the three possible 
outcomes occurs: correct recognition, misrecognition, or rejection. This 
experiment is repeated $n$ times, resulting in $m_e$ samples which have been
misrecognized and $m_r$ samples which have been rejected.

Since these outcomes are mutually exclusive, and each experiment
independent, then the resulting random variables, $m_e$ and $m_r$, clearly
are distributed according to the multinomial probability distribution.
That is, the joint probability distribution of $m_e$ and $m_r$, $P(m_e, m_r)$, is
given by

$$P(m_e, m_r) = \frac{n!}{m_e!\,m_r!\,(n - m_e - m_r)!}\,e^{m_e}\,r^{m_r}\,(1 - e - r)^{n - m_e - m_r}. \tag{3}$$

The maximum-likelihood estimates for $e$ and $r$, denoted by $\hat{e}$ and $\hat{r}$, are
then

$$\hat{e} = \frac{m_e}{n}, \tag{4}$$

$$\hat{r} = \frac{m_r}{n}, \tag{5}$$

which are the estimates in common use. Further, each of these estimates
is proportional to a single random variable having a binomial distribution;
therefore, $n\hat{e}$ and $n\hat{r}$ are themselves binomially distributed. The
mean value of each estimate is the parameter for which it is an estimate;
the variance of each is

$$\sigma_{\hat{e}}^2 = \frac{e(1 - e)}{n}, \tag{6}$$

$$\sigma_{\hat{r}}^2 = \frac{r(1 - r)}{n}. \tag{7}$$
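The estimates and variances above translate directly into a few lines of code. The sketch below is illustrative only: the function names are hypothetical, and in `estimate_std` the unknown true probability is replaced by whatever value the caller supplies (for instance the estimate itself, which is an approximation and not part of the derivation above).

```python
import math

def multinomial_probability(m_e, m_r, n, e, r):
    """Joint probability P(m_e, m_r) of (3) for n independent trials."""
    m_c = n - m_e - m_r                       # correctly recognized samples
    coeff = math.factorial(n) // (
        math.factorial(m_e) * math.factorial(m_r) * math.factorial(m_c))
    return coeff * e**m_e * r**m_r * (1 - e - r) ** m_c

def rate_estimate(count, n):
    """Maximum-likelihood estimate of (4) or (5): observed proportion."""
    return count / n

def estimate_std(p, n):
    """Standard deviation of the estimate, from (6) or (7)."""
    return math.sqrt(p * (1 - p) / n)

# Example: 50 misrecognitions in 250 trials.
e_hat = rate_estimate(50, 250)
sigma = estimate_std(e_hat, 250)   # plug-in value, an approximation
```

As a sanity check on (3), the probabilities summed over all admissible pairs $(m_e, m_r)$ for a small $n$ should total one.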

Because it is known that né and n7 are binomially distributed, con- 
fidence intervals can be applied to these estimates.* These confidence 
intervals require rather involved computations, but fortunately have 
been plotted for several values of n by various people.’* In Fig. 1 is 
shown such a plot of intervals for a 95 per cent confidence level computed 
by C.8. Clopper and E. 8. Pearson. The use of this graph is fairly simple. 
A vertical line extended upward from the observed value of the estimate 
given on the abscissa will intersect the pair of curves pertaining to the 
particular sample size used. Projecting these two intersections horizon- 
tally onto the ordinate axis gives an interval for the parameter being 
estimated. The probability is 0.95 that the interval drawn in this manner 
includes the parameter. For instance, if a sample size of n = 250 yielded 
50 errors, then the estimate of the probability of error is 0.20. Using 
Fig. 1 it can be stated that, with probability 0.95, the true probability 
of error is included in the interval from 0.15 to 0.27. 
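The graphical procedure can also be carried out numerically. The following is a minimal sketch, assuming that the curves of Fig. 1 are the exact two-sided binomial limits (the function names and the bisection depth are illustrative choices, not taken from the paper):

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X binomially distributed with parameters n and p."""
    return sum(comb(n, j) * p**j * (1.0 - p)**(n - j) for j in range(k + 1))

def bisect_decreasing(f, iterations=60):
    """Root of a function that decreases from positive to negative on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def binomial_interval(k, n, level=0.95):
    """Exact two-sided confidence interval for the parameter, given k events in n trials."""
    alpha = 1.0 - level
    # Lower limit: the p at which P(X >= k) equals alpha/2.
    lower = 0.0 if k == 0 else bisect_decreasing(
        lambda p: alpha / 2 - (1.0 - binom_cdf(k - 1, n, p)))
    # Upper limit: the p at which P(X <= k) equals alpha/2.
    upper = 1.0 if k == n else bisect_decreasing(
        lambda p: binom_cdf(k, n, p) - alpha / 2)
    return lower, upper

# The example in the text: 50 errors in a sample of n = 250.
low, high = binomial_interval(50, 250)
print(f"estimate 0.20, 95 per cent interval {low:.3f} to {high:.3f}")
```

For the example in the text this should give limits close to the values read from the graph; small differences are to be expected, since reading a plotted curve is less precise than the computation.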

* Mattson has used a similar argument for determining convergence of an 
adaptive system. However, he used Tchebycheff's inequality to obtain 
confidence intervals which are necessarily larger than if he had used 
such intervals pertaining to the binomial distribution. 

Fig. 1 - 95 per cent confidence intervals for a binomially distributed variable. 


Case 2. Known a priori Probabilities - Selective Sampling 


It is now assumed that the a priori probability of occurrence for each 
class, $w_i$, is known. To take advantage of this knowledge, the 
experimenter takes $n_i$ samples from each class $i$ such that 

$$\frac{n_i}{n} = w_i, \tag{8}$$

where $n$ is the total number of samples. This process will be referred to 
as selective sampling.* (It will be assumed that the $w_i$ are such that (8) 
can be fulfilled with the desired sample size, $n$.) 





* This sort of sampling dichotomy has been previously noted by others. For 
instance, Bowley and Neyman have referred to these two methods as 
"unrestricted" and "stratified" sampling. 
The machine is again allowed to attempt recognition of these patterns, 
resulting in $m_{e_i}$ samples from class $i$ being misrecognized, and 
$m_{r_i}$ samples from class $i$ being rejected. 

For any class $i$, the joint probability distribution for $m_{e_i}$ and 
$m_{r_i}$ again is multinomial: 

$$P(m_{e_i}, m_{r_i}) = \frac{n_i!}{m_{e_i}!\, m_{r_i}!\, (n_i - m_{e_i} - m_{r_i})!}\; e_i^{m_{e_i}}\, r_i^{m_{r_i}}\, (1 - e_i - r_i)^{n_i - m_{e_i} - m_{r_i}}. \tag{9}$$

Since each of these distributions is independent of the others in this 
experiment, then the joint probability of the outcome for all $c$ classes 
is the product of the individual probabilities (9): 

$$P(m_{e_1}, \ldots, m_{e_c}, m_{r_1}, \ldots, m_{r_c}) = \prod_{i=1}^{c} \frac{n_i!}{m_{e_i}!\, m_{r_i}!\, (n_i - m_{e_i} - m_{r_i})!}\; e_i^{m_{e_i}}\, r_i^{m_{r_i}}\, (1 - e_i - r_i)^{n_i - m_{e_i} - m_{r_i}}. \tag{10}$$


This is no longer a multinomial probability distribution. However, since 
the maximum-likelihood estimate of a sum of independent variables is 
the sum of the maximum-likelihood estimates, then these estimates for 
e and r are 

$$\hat{e} = \frac{1}{n} \sum_{i=1}^{c} m_{e_i}, \tag{11}$$

$$\hat{r} = \frac{1}{n} \sum_{i=1}^{c} m_{r_i}, \tag{12}$$

which again agree with the popular practice of using the proportions as 
estimates. The random variables of which $n\hat{e}$ and $n\hat{r}$ are 
values are not now binomially distributed, since a sum of binomially 
distributed variables is not itself binomially distributed in general. 

The mean of each estimate is again the particular parameter being 
estimated. The variance of each of these estimates can be computed: 


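$$\sigma_{\hat{e}}'^{\,2} = \frac{1}{n^2} \sum_{i=1}^{c} \sigma_{m_{e_i}}^2 = \frac{1}{n^2} \sum_{i=1}^{c} n_i e_i (1 - e_i) = \frac{1}{n} \sum_{i=1}^{c} w_i e_i (1 - e_i), \tag{13}$$

in which use of (8) is made, and the prime distinguishes this variance 
from that for random sampling. Similarly, 

$$\sigma_{\hat{r}}'^{\,2} = \frac{1}{n} \sum_{i=1}^{c} w_i r_i (1 - r_i). \tag{14}$$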


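It is of interest to compare these variances for selective sampling 
with those obtained for the case of random sampling. Since the variance 
for $\hat{r}$ has the same form as that for $\hat{e}$ in both cases, it 
is necessary to consider only one of them, say $\hat{e}$. First note that 
$\sigma_{\hat{e}}^2$ can be written, using (1) and (6), as 

$$\sigma_{\hat{e}}^2 = \frac{1}{n} \left( \sum_{i=1}^{c} w_i e_i \right) \left( 1 - \sum_{i=1}^{c} w_i e_i \right). \tag{15}$$

From (13), 

$$\sigma_{\hat{e}}^2 - \sigma_{\hat{e}}'^{\,2} = \frac{1}{n} \left[ \sum_{i=1}^{c} w_i e_i^2 - \left( \sum_{i=1}^{c} w_i e_i \right)^2 \right]. \tag{16}$$

Noting that $\sum_{i=1}^{c} w_i = 1$, (16) can be written as 

$$\sigma_{\hat{e}}^2 - \sigma_{\hat{e}}'^{\,2} = \frac{1}{n} \sum_{i=1}^{c} w_i \left( e_i - \sum_{j=1}^{c} w_j e_j \right)^2 = \frac{1}{n} \sum_{i=1}^{c} w_i (e_i - e)^2 = \frac{1}{n}\, \sigma_{e_i}^2 \geq 0. \tag{17}$$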

Hence, the variance in the case of random sampling is at least as great 
as the variance in the case of selective sampling, the difference being 
(apart from the factor 1/n) what might be interpreted as the variance of 
the class errors. That is, if $e_i$ is treated as a random variable with 
probability distribution $w_i$, then $\sigma_{e_i}^2$ is the variance of 
$e_i$. (A similar derivation holds for the variance 
of the rejection probability estimates.) That the selective sampling 
variance should be smaller than the random sampling variance might 
be expected, since in selective sampling more information is used, namely 
the a priori probabilities. 
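The comparison can be checked numerically. The sketch below evaluates the random-sampling variance e(1 - e)/n and the selective-sampling variance (1/n) Σ w_i e_i (1 - e_i), and confirms that their difference is the variance of the class errors divided by n. The class probabilities, class error rates, and sample size are hypothetical values chosen only for illustration.

```python
def sampling_variances(w, e_class, n):
    """Variance of the error estimate under random and selective sampling.

    w       -- a priori class probabilities (summing to one)
    e_class -- error probability of the machine for each class
    n       -- total number of test samples
    """
    e_total = sum(wi * ei for wi, ei in zip(w, e_class))   # over-all error probability
    var_random = e_total * (1.0 - e_total) / n             # random sampling
    var_selective = sum(wi * ei * (1.0 - ei)               # selective sampling
                        for wi, ei in zip(w, e_class)) / n
    return e_total, var_random, var_selective

# Hypothetical three-class machine.
w = [0.5, 0.3, 0.2]
e_class = [0.02, 0.10, 0.30]
n = 200

e_total, var_random, var_selective = sampling_variances(w, e_class, n)
# Variance of the class errors, treating e_i as a random variable with distribution w_i.
class_error_variance = sum(wi * (ei - e_total) ** 2 for wi, ei in zip(w, e_class))

print(var_random, var_selective, var_random - var_selective, class_error_variance / n)
```

The saving from selective sampling is largest when the class error rates differ widely, and vanishes when every class has the same error rate.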

Although statements have been made concerning the mean and 
variance of the estimates in the selective sampling case, nothing has 
been said yet concerning confidence intervals. This is a much more 
complicated problem than that in the case of random sampling, since 
the estimates do not have a simple distribution function. In fact, the 
confidence intervals will in general depend on the particular set of 
$e_i$'s (or $r_i$'s) pertaining to the machine, and not simply on e (or r). 

However, for small probabilities, the binomial distribution is quite 
closely approximated by the Poisson distribution, the fit becoming 
perfect as the probability approaches zero. For any reasonable 
recognition machine, one would expect the probabilities of error and 
rejection to be small; consequently, the marginal form of (9) for 
$m_{e_i}$ or $m_{r_i}$ may be approximated by a Poisson distribution. The 
estimates given by (11) and (12) are now sums of random variables with 
Poisson distributions (approximately) which are then themselves Poisson 
distributed. If the over-all error is also small, as is usually the case, 
the binomial-Poisson approximation can now be used in reverse, and one 
may state that, for small error rates, the error and rejection estimates 
(11) and (12) are approximately binomially distributed. Consequently, 
one can use Fig. 1 to obtain 95 per cent confidence intervals for the 
error and rejection probabilities. Further, from (17), we would expect 
this confidence interval to be on the safe side, that is, the actual 95 
per cent confidence interval should be slightly smaller than this. 
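The quality of this approximation can be examined directly for a particular case. The following sketch computes the exact distribution of the total number of errors under selective sampling (a sum of independent binomial variables, obtained by convolution) and compares it with a single binomial distribution having the same n and the same over-all error probability. The class sample sizes and class error rates are hypothetical, and the total variation distance is merely one convenient measure of closeness, not one used in the paper.

```python
from math import comb

def binomial_pmf(n, p):
    """List of P(X = k), k = 0..n, for a binomial variable."""
    return [comb(n, k) * p**k * (1.0 - p)**(n - k) for k in range(n + 1)]

def convolve(a, b):
    """Distribution of the sum of two independent non-negative integer variables."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] += ai * bj
    return out

def variance(pmf):
    mean = sum(k * pk for k, pk in enumerate(pmf))
    return sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

# Hypothetical selective sample: n_i samples and error probability e_i per class.
n_i = [100, 60, 40]
e_i = [0.01, 0.03, 0.08]
n = sum(n_i)
e = sum(ni * ei for ni, ei in zip(n_i, e_i)) / n    # over-all error probability

exact = [1.0]
for ni, ei in zip(n_i, e_i):
    exact = convolve(exact, binomial_pmf(ni, ei))   # exact distribution of total errors
approx = binomial_pmf(n, e)                         # binomial approximation

tv_distance = 0.5 * sum(abs(x - y) for x, y in zip(exact, approx))
print(e, tv_distance, variance(exact), variance(approx))
```

In this case the exact distribution has the smaller variance, which is the sense in which intervals taken from the binomial curves err on the safe side.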


APPLICATION TO PUBLISHED RESULTS 


To illustrate the ease of determining these confidence intervals, some 
published results in pattern recognition are listed in Table I along with 
the 95 per cent confidence intervals as determined from Fig. 1. It should 
be emphasized that Table I is not meant to compare one method against 
another, since the methods obviously treat problems of various com- 
plexities. Rather, the table is meant to compare the accuracies of the 
various evaluating experiments. 

Three points of caution should be noted concerning the validity of the 
confidence intervals in this table. First, the author is not positive that 
the test data is different from the design data in every case. Second, to 
the best of the author’s knowledge, in every case the number of samples 
taken from each allowable pattern class was predetermined. This is 
selective sampling; therefore, it is assumed that the proportion of samples 
taken from each class represents its a priori probability of occurrence. 
The third assumption is that the patterns used to test the machine are 
a reasonable sampling from the real-life world of patterns, and are not 
biased toward either well-formed or poorly-formed (noisy) patterns. 


CONCLUSION 


Two important cases concerning the testing of pattern recognition 
methods or machines have been considered: random sampling for the 
case of unknown a priori probabilities of class occurrence, and selective 
sampling for the case of known a priori probabilities. The most 
predominant form of testing in the present day art is to assume that the 
pattern classes have equal a priori probabilities of occurrence, and 
consequently to use equal sample sizes for each class; this is a special 
case of selective sampling. 

It has been shown that, for both cases, the maximum-likelihood 
estimate for the error probability or rejection probability is simply the 
proportion of samples misrecognized or rejected. In the case of random 
sampling, the estimates are binomially distributed, and accurate 
confidence intervals can be obtained. In the case of selective sampling, 
tighter estimates are obtained which are approximately binomially 
distributed for small error rates. Conservative confidence limits may 
then be obtained for these estimates. 

Using these notions, the experimenter can now determine the sample 
size required to obtain results which he deems significant. Alternatively, 
if he has a limited sample size, he can determine the significance of his 
results. Note that in both cases considered, the variance is inversely 
proportional to the sample size. This does not mean that the confidence 
interval is inversely proportional to the square root of the sample size, 
however, since a binomial rather than a normal distribution pertains. 
However, perusal of Fig. 1 seems to indicate that this is a good rule of 
thumb. Note also that the total number of samples required to obtain a 
certain confidence in the results seems to be independent of the number 
of allowable pattern classes. This is an interesting philosophical point 
to ponder. 


TABLE I - 95 PER CENT CONFIDENCE INTERVALS FOR SOME PUBLISHED RESULTS 

| Author | Pattern Classes | Measurements | Decision Method | Sample Size | Error Estimate | 95% Confidence Interval |
|---|---|---|---|---|---|---|
| Baran, Estrin | Machine Printed Numbers | Presence of ink in elements of 30 x 32 matrix | Maximize a posteriori probability (Bayes' Equation) | 480 | 9% | 7%-12% |
| Bledsoe, Browning | Hand-Printed Alpha-Numerics | Presence of mark in elements of 10 x 15 matrix | Matching 2-tuples of matrix elements against table | 180 | 21.6% | 18%-29% |
| Bomba | Hand-Printed Alpha-Numerics | Topological features (orientation of straight lines, intersections, etc.) | Decision tree | 112 | 0% | 0%-4% |
| Doyle | Hand-Printed AEILMNORST | Simply measured topological features | Maximize a posteriori probability (Bayes' Equation) | ~450 | 12.5% | 10%-16% |
| Frishkopf | Handwritten words | Extremes, and interconnections between extremes | Cross-correlation against dictionary | 160 | 68% | 57%-77% |
| Harmon | Unsegmented Handwritten Letters | Topological features (cusps, closures, special marks, etc.) | Decision tree | 412 | 41.1% | 37%-46% |
| Mathews, Denes | Spoken digits | Frequency vs time spectra | Cross-correlation against previous averaged spectra from same speaker | 99 | 6% | 2%-12% |
| Marill, Green | Handwritten A,B,C (done as example only) | Distance of character from field edge along eight different line segments | Likelihood function assuming normal distribution of measures | 90 | 3% | 1%-10% |
| Sebestyen | Spoken digits | Frequency vs time spectra | Minimization of non-Euclidean distance measure to average spectra | 20 | 0% | 0%-18% |


PART II - ON DESIGN 


INTRODUCTION 


Part I of this paper was concerned with the estimation of the per- 
formance of a given pattern recognition machine. There it was shown 
how confidence intervals could be found for these estimates. These 
results are nonparametric in that they hold for any categorization 
machine (or procedure) regardless of its structure. 

We now consider the following problem. An experimenter desires to 
solve a particular pattern recognition problem. He has at his disposal a 
set of different methods for solving this problem, but it is not clear to 
him which is the best to use. Consequently, he desires to estimate the 
performance of each method when applied to this problem, and choose 
the best. Let us assume that each method is characterized by certain 
key parameters which, when known, completely determine the recogni- 
tion machine. To evaluate any particular recognition method, the experi- 
menter plans to design the corresponding machine by estimating its 
parameters on the basis of one sampling from the real world of patterns, 
and then to test this machine based on another sampling (either by 
constructing the machine or by simulating it). 

However, in many practical applications, the total sample size avail- 
able to the experimenter for design and test purposes is limited. For 
instance, he may be interested in building a machine to read hand- 
printed numbers, but he may not have an automatic scanner available 
to him. Since simulating a scanner by hand is very tedious, he may not 
be willing to scan more than a certain number of samples. 
Or, he may be interested in distinguishing between radar returns 
caused by missiles and those caused by decoys. Since it is expensive to 
actually run the sort of experiment required to gather data for this 
problem, budget limitations will certainly place a limit on the number 
of available samples. 

Another example is in the field of automatic diagnosis of diseases. 
The experimenter may, for instance, be interested in building a machine 
which would determine the presence of cancer based on a list of symp- 
toms. However, records have been maintained for only a certain number 
of people who have contracted this disease, and the sample size is thus 
definitely limited. 

The following problem then arises. If the total sample size is fixed, 
what is the optimum partitioning of this sample between the design and 
test phases? This is a rather loose, but concise, statement of the problem. 
A more accurate one follows. 

Assume that the experimenter is concerned with the study of a 
particular pattern recognition method as applied to some particular 
problem. The optimum pattern recognition machine based upon this method 
would have an error probability $e_o$. The experimenter is interested in 
estimating $e_o$ so that he can decide whether the particular method 
under study is adequate for the solution of his problem, or alternatively 
whether it is better than another method. To do this, he takes a sample 
of a certain size $t$ from the real-life world of patterns. He desires to 
use part of this sample to design a machine according to the particular 
method under study. The machine which he thus designs will have an 
actual error probability $e \geq e_o$ (both quantities are unknown to the 
experimenter). He then uses the remaining part of his original sample 
to test the machine (according to the procedures of Part I). He thus 
obtains an estimate of $e$, which will be denoted by $\hat{e}$. It will be 
shown that $\hat{e}$ is a biased estimate of $e_o$, and that the bias can 
be computed. Consequently $\hat{e}$ can be adjusted so that it gives an 
unbiased estimate, $\hat{e}_o$, of $e_o$. The optimum partitioning of the 
total sample will be defined as that partitioning which minimizes the 
variance of $\hat{e}_o$. Thus, if the experimenter follows this procedure, 
he will obtain an unbiased minimum variance estimate of $e_o$, the 
optimum error probability. Of course, if he finally decides that a 
particular method is applicable, he can then redesign the corresponding 
machine with the entire sample size. 


OPTIMUM SAMPLE PARTITIONING 

We are interested, then, in minimizing the quantity 

$$\sigma_{\hat{e}_o}^2 = E[(\hat{e}_o - e_o)^2] = E[\hat{e}_o^2] - e_o^2, \tag{18}$$

where $E[x]$ and $\sigma_x^2$ denote the expected value and variance of 
$x$, respectively. 

Let us first digress and consider the biased estimate $\hat{e}$. Since 
$\hat{e}$ is discrete (it is the proportion of test samples 
misrecognized), its expected value can be written 

$$E[\hat{e}] = \sum \hat{e}\, p(\hat{e}),$$

where the summation is over all values of $\hat{e}$, and $p(a)$ denotes 
the probability of $a$. But 

$$p(\hat{e}) = \int p(\hat{e} \mid e)\, p(e)\, de,$$

where $p(\hat{e} \mid e)$ is the probability of $\hat{e}$ given $e$, and the 
integral is over all (continuous) values of $e$ (by definition 
$e_o \leq e \leq 1$). Hence 

$$E[\hat{e}] = \sum \hat{e} \int p(\hat{e} \mid e)\, p(e)\, de = \int \left[ \sum \hat{e}\, p(\hat{e} \mid e) \right] p(e)\, de.$$

Let us henceforth consider only the case of random sampling. Then 
$\hat{e}$ is proportional to a binomially distributed variable 
($n\hat{e}$) with parameter $e$. Therefore the term in brackets, which is 
the expected value of $\hat{e}$ given the parameter $e$, is just $e$. Then 

$$E[\hat{e}] = \int e\, p(e)\, de = E[e]. \tag{19}$$

$E[e]$ is a function only of the parameters of the problem and the design 
sample size; it is not a random variable. 

We next determine $E[\hat{e}^2]$. By going through a process analogous 
to the above, and by making use of (19), we obtain 

$$\sigma_{\hat{e}}^2 = E[(\hat{e} - E[\hat{e}])^2] = E[\hat{e}^2] - (E[e])^2 = \frac{E[e(1 - e)]}{n},$$

where $n$ is the size of the test sample. Hence 

$$E[\hat{e}^2] = \frac{E[e(1 - e)]}{n} + (E[e])^2. \tag{20}$$


We now determine $E[e]$. Let the optimum machine be described by 
c different parameters $\theta_{oi}$, $1 \leq i \leq c$. The design of the 
machine consists of estimating the parameters $\theta_{oi}$ by making 
measurements on a set of sample patterns (the design sample). Let the 
estimates of these parameters be denoted $\theta_i$, $1 \leq i \leq c$. 
Then the error probability $e$ of the resulting machine is a function of 
the estimates of the true parameters: 

$$e = e(\theta_1, \theta_2, \ldots, \theta_c).$$


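One can now expand $e$ in a Taylor series expansion about its minimum 
point, $e_o$. Since this is a minimum point, all the coefficients of the 
linear terms will be zero. If the error deviation $(e - e_o)$ is small, 
terms above the second order term may be neglected: 

$$e \approx e_o + \frac{1}{2} \sum_{i=1}^{c} \sum_{j=1}^{c} \left. \frac{\partial^2 e}{\partial \theta_i\, \partial \theta_j} \right|_{\theta_o} (\theta_i - \theta_{oi})(\theta_j - \theta_{oj}).$$

The expected value of the error for the resulting machine is then 

$$E[e] = e_o + \frac{1}{2} \sum_{i=1}^{c} \sum_{j=1}^{c} \left. \frac{\partial^2 e}{\partial \theta_i\, \partial \theta_j} \right|_{\theta_o} E[(\theta_i - \theta_{oi})(\theta_j - \theta_{oj})].$$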





If it is assumed that the estimates are unbiased, i.e., 
$E(\theta_i) = \theta_{oi}$, then the above equation may be written as 

$$E[e] = e_o + \frac{1}{2} \sum_{i=1}^{c} \sum_{j=1}^{c} a_{ij}\, \sigma_{ij}, \tag{21}$$

where 

$$a_{ij} = \left. \frac{\partial^2 e}{\partial \theta_i\, \partial \theta_j} \right|_{\theta_o},$$

$\sigma_{ij}$ is the covariance of the estimates for $\theta_{oi}$ and 
$\theta_{oj}$, and $\sigma_{ii} = \sigma_i^2$ is the variance of the 
estimate for $\theta_{oi}$. (21) is valid for small values of the 
quantity $(e - e_o)$. 

It may be worth-while to digress here to a simple example which may 
help to clarify the definitions of the above terms. Zachary Oglethorpe 
is not only a crafty fisherman, but is also a good gadgeteer. He has 
decided to try to build equipment which will determine each day 
whether he should use a surface bait or a deep water bait in order to 
catch the maximum number of fish. He has means available to measure 
the water temperature, the magnitude of surface ripple, and the 
atmospheric pressure, and therefore decides to use these as his 
measurements. He denotes values of these measurements by $m_1$, $m_2$, 
and $m_3$ respectively. 

Mr. Oglethorpe has been recording values of these measurements 
every day for the past six months, and has noted on each day whether 
he was more successful with surface or deep water bait. He thus has a 
total sample size of roughly 180 samples, some from one pattern class 
(surface bait), and some from the other pattern class (deep water bait). 
Since each sample was taken without a priori knowledge of the class 
to which it belonged, then this constitutes random sampling; that is, 
the proportion of samples in each class is an estimate of the a priori 
probability of occurrence of that class. 

Our crafty fisherman decides to build a decision making, or pattern 
recognition, machine by building a correlator for each of the two possi- 
ble decisions (or pattern classes). That is, the machine will make the 
following two calculations: 


Surface bait = $\theta_1 m_1 + \theta_2 m_2 + \theta_3 m_3$, 
Deep water bait = $\theta_4 m_1 + \theta_5 m_2 + \theta_6 m_3$. 


The class achieving the highest value represents the desired decision. 
Let us assume that, according to some theory, the optimum values of 
the $\theta_i$ are the means for each measurement within the appropriate 
pattern class, normalized so that the sum of the squares of the 
coefficients of each linear form is unity. That is, $\theta_1$ is 
proportional to the mean water temperature when surface bait should be 
used, and so forth, and is normalized with $\theta_2$ and $\theta_3$ so 
that $\theta_1^2 + \theta_2^2 + \theta_3^2 = 1$. 

Thus the parameters $\theta_i$ completely characterize this pattern 
recognition machine in that, given values for each $\theta_i$, 
$1 \leq i \leq 6$, the machine may be built. The optimum values for each 
$\theta_i$ are the appropriate normalized means, which are the 
$\theta_{oi}$ of the previous equations. Mr. Oglethorpe obtains estimates 
of these optimum parameters by taking normalized averages over a portion 
of the appropriate data. These estimates are the $\theta_i$ of the 
previous equations, and are the actual numbers 
on which he would base the construction of his machine. Note that, 
in this case, these estimates are unbiased and efficient, and may very 
well be independent of each other (e.g., the probability distribution of 
the water temperature when surface bait should be used may be inde- 
pendent of the values of surface ripple magnitude and atmospheric 
pressure). 
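The design step just described can be sketched in a few lines. The measurement values below are hypothetical (the paper gives no data), the units are arbitrary, and the function names are illustrative; the sketch only shows how normalized class means serve as the six parameters and how the larger of the two linear forms gives the decision.

```python
from math import sqrt

def normalized_mean(samples):
    """Mean measurement vector of one class, scaled so its squared components sum to one."""
    k = len(samples[0])
    mean = [sum(s[i] for s in samples) / len(samples) for i in range(k)]
    norm = sqrt(sum(v * v for v in mean))
    return [v / norm for v in mean]

def design(surface_samples, deep_samples):
    """Estimate the six parameters: three coefficients for each linear form."""
    return {"surface": normalized_mean(surface_samples),
            "deep": normalized_mean(deep_samples)}

def decide(machine, m):
    """Evaluate both linear forms on measurements m and pick the larger."""
    scores = {label: sum(t * x for t, x in zip(theta, m))
              for label, theta in machine.items()}
    return max(scores, key=scores.get)

# Hypothetical design sample: (water temperature, surface ripple, pressure).
surface_days = [(20.0, 1.0, 30.0), (22.0, 2.0, 29.0), (21.0, 1.0, 31.0)]
deep_days = [(10.0, 5.0, 30.0), (12.0, 6.0, 29.0), (11.0, 4.0, 31.0)]

machine = design(surface_days, deep_days)
print(decide(machine, (21.0, 1.0, 30.0)))
print(decide(machine, (11.0, 5.0, 30.0)))
```

Testing the machine then amounts to running `decide` on the days held out of the design sample and counting the disagreements with the recorded outcome.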

Having thus designed his fisherman’s aid with a portion of his data, 
he now tests it with the remainder of the data to determine its accuracy. 
He does not want to use it if there is a good probability that it is less 
accurate than he has found his own intuition to be. This then leads 
us to the basic problem being studied: How should Zachary Oglethorpe 
split his total sample between the design and the testing of his machine 
to obtain the best estimate of the accuracy of the machine? Again, if 
the estimated accuracy of his machine were sufficient, he would then 
be wise to redesign it, basing the new design on the entire sample. 

We now return to the study of this sample partitioning. Let each 
parameter be estimated with m samples.* If each of these estimates is 
an efficient and unbiased estimate, and if the estimates are independent 
(either because the estimates are statistically independent, or because 
different samples are used to estimate each), then all 
$\sigma_{ij} = 0$, $i \neq j$, and all $\sigma_i^2$ will be proportional 
to 1/m. Hence one can rewrite (21) as 

$$E[e] = e_o + \frac{b}{m}, \tag{22}$$

where b is some constant calculated from (21). (Often, $E[e]$ is in the 
form (22) even if the estimates are not independent.) 

Let t be the total sample size, and p be the number of sets of m 
samples used to design the machine. p is chosen to be the smallest number 
which insures that $E[e]$ is of the form (22). It is often simply the 
number of allowable pattern classes, since, of course, parameters of 
different classes must be estimated with different samples. If n is the 
test sample size, then 

$$t = n + pm. \tag{23}$$

From (19) and (22), 

$$E[\hat{e}] = E[e] = e_o + \frac{b}{m}. \tag{24}$$

Consequently, $\hat{e}$ is a biased estimate of $e_o$. The adjusted 
estimate $\hat{e}_o$, given by 

$$\hat{e}_o = \hat{e} - \frac{b}{m}, \tag{25}$$

is an unbiased estimate of $e_o$, with variance given by (18). This 
variance can now be rewritten using (25): 

$$\sigma_{\hat{e}_o}^2 = E[\hat{e}_o^2] - e_o^2 = E\!\left[\left(\hat{e} - \frac{b}{m}\right)^2\right] - e_o^2 = E[\hat{e}^2] - 2\,\frac{b}{m}\, E[\hat{e}] + \left(\frac{b}{m}\right)^2 - e_o^2.$$


* This is not always desirable, since some parameters may be easier to estimate 
than others, or there may be more data available for some parameters than 
others. However, this condition is assumed here for simplicity, as are the following 
assumptions of efficiency, unbiasedness, and independence. 
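From (20) and (24), 

$$\sigma_{\hat{e}_o}^2 = \frac{E[e(1 - e)]}{n} + (E[e])^2 - 2\,\frac{b}{m}\left(e_o + \frac{b}{m}\right) + \left(\frac{b}{m}\right)^2 - e_o^2 = \frac{E[e(1 - e)]}{n} + (E[e])^2 - \left(e_o + \frac{b}{m}\right)^2.$$

Thus, from (24), 

$$\sigma_{\hat{e}_o}^2 = \frac{E[e(1 - e)]}{n}. \tag{26}$$

If $b/m \ll 1$ (which will certainly be true for any reasonable design), 
then 

$$\sigma_{\hat{e}_o}^2 \approx \frac{\left(e_o + \dfrac{b}{m}\right)(1 - e_o)}{n} = \frac{1 - e_o}{n}\left(e_o + \frac{pb}{t - n}\right), \tag{27}$$

where the relation (23) was used. 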
We wish to choose n such that (27) is minimized. Differentiating 
(27) and equating to zero, one obtains 

$$\frac{e_o t}{pb} = \frac{2\,\dfrac{n_o}{t} - 1}{\left(1 - \dfrac{n_o}{t}\right)^2}, \tag{28}$$

where $n_o$ is that value of n satisfying (28); it is the optimum test 
sample size in the sense previously discussed. $n_o/t$ is of course the 
proportion of the total sample used for the test. One interesting result 
is immediately obvious: $n_o/t$ must be greater than 0.5 for all cases. 
The equation (28) is plotted in Fig. 2, from which the following general 
statements can be made. 
1. The proportion of the total sample that should be used to test 
the machine should never be less than 50 per cent. 
2. If $e_o t/pb < 0.1$, then the proportion used for design should be 
about 50 per cent. 
3. The proportion of the total sample that should be used to test the 
machine becomes larger as: 
a. The total sample size increases, 
b. the error of the optimum machine increases, 
c. the effectiveness of the design increases (pb decreases). 
Here 1/pb is taken as a measure of the effectiveness of the design, 
since pb is the product of the expected deviation from optimum, 
$E[e - e_o]$, and the design sample size, pm. 

Fig. 2 - Optimum sample partitioning. 
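The optimum partition can also be computed without the graph. The sketch below takes (27) in the form (1 - e_o)(e_o + pb/(t - n))/n and (28) in the form e_o t/pb = (2 n_o/t - 1)/(1 - n_o/t)^2, as read from the derivation above; writing y = 1 - n_o/t turns (28) into a quadratic in y with one positive root. The numerical values of e_o, t, and pb are hypothetical, and a brute-force search over n is included only as a check on the closed form.

```python
from math import sqrt

def optimum_test_fraction(e_o, t, pb):
    """Solve (28) for n_o/t, given the optimum error, total sample size, and pb."""
    c = e_o * t / pb
    # With y = 1 - n_o/t, (28) becomes c*y**2 + 2*y - 1 = 0; take the positive root.
    y = (sqrt(1.0 + c) - 1.0) / c
    return 1.0 - y

def variance_27(n, e_o, t, pb):
    """Approximate variance (27) of the adjusted estimate for test sample size n."""
    return (1.0 - e_o) / n * (e_o + pb / (t - n))

# Hypothetical problem: optimum error 5 per cent, 400 samples in all, pb = 2.
e_o, t, pb = 0.05, 400, 2.0

x_opt = optimum_test_fraction(e_o, t, pb)
n_brute = min(range(1, t), key=lambda n: variance_27(n, e_o, t, pb))

print(f"closed form: n_o = {x_opt * t:.1f} of {t}; brute force: n = {n_brute}")
print(f"variance at the optimum: {variance_27(x_opt * t, e_o, t, pb):.6f}")
```

Because the solution depends on the data only through e_o t/pb, a single curve such as that of Fig. 2 covers every case.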

These results indicate just how a sample should be split between the 
design and test stages of a feasibility study of a pattern recognition 
method. If the experimenter follows this procedure, he will obtain an 
estimate $\hat{e}_o$ of $e_o$ which is unbiased and has minimum variance. 

The value of this minimum variance can be expressed as 

$$\sigma_{\hat{e}_o,\min}^2 = \frac{e_o(1 - e_o)}{n_o} \cdot \frac{\dfrac{n_o}{t}}{2\,\dfrac{n_o}{t} - 1},$$

which was obtained by eliminating pb between (27) and (28). Note that 
this is the variance that would have been obtained if the optimum 
machine were tested with $n_o$ samples, increased by a factor which 
accounts for the design error. 


AN EXAMPLE OF OPTIMUM SAMPLE PARTITIONING 


As an illustration of these ideas, consider the following example 
(perhaps the simplest of the n-dimensional problems). A pattern 
recognition machine is to be designed using the optimum decision function 
which will distinguish between q classes. The occurrence of each class is 
equally probable a priori, and all costs of misrecognition are the same. 
The receptor makes a set of k measurements $m_i$, $1 \leq i \leq k$, on 
each input pattern. It is known that each measurement is normally 
distributed with variance $\sigma^2$, and that all measurements are 
independent. Further, it is known that the distances between the mean 
vectors in measurement space* are all equal. (Consequently, there can be 
no more than k + 1 pattern classes. The tips of the mean vectors are the 
vertices of a regular polytope.) 

It can then be shown that the optimum decision function partitions 
the measurement space into polytopes which are bounded by those 
hyperplanes which are the perpendicular bisectors of the line segments 
joining all pairs of means. The hyperplane separating two classes, say 
classes 1 and 2, is the set of all points $(x_1, \ldots, x_k)$, 
represented by the vector $\mathbf{x}$, which satisfy 

$$\mathbf{x} \cdot (\boldsymbol{\mu}_1 - \boldsymbol{\mu}_2) = \tfrac{1}{2} (\boldsymbol{\mu}_1 \cdot \boldsymbol{\mu}_1 - \boldsymbol{\mu}_2 \cdot \boldsymbol{\mu}_2), \tag{29}$$

where $\boldsymbol{\mu}_i$ is the mean vector of class $i$. 

The design procedure consists of estimating each mean vector from 
a sampling; denote the estimated mean vector for class $i$ by 
$\bar{\mathbf{x}}_i$. The distribution of the estimate of a mean vector 
from a normal distribution with covariance matrix [V] is also normal 
with covariance matrix (1/m)[V], where m is the sample size used in the 
estimate. Since the measurements are independent in this case, then so 
will be the estimates of the means of the various measurements. 
Furthermore, each estimate will have a variance of $\sigma^2/m$. 
Consequently, only one set of samples of size m from each pattern class 
is required to insure that the form (22) is valid, and p is hence equal 
to the number of allowable pattern classes, q. 

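It is shown in the Appendix that b is given by 

$$b = \frac{q - 1}{4} \cdot \frac{\Delta\mu}{2\sigma}\, N\!\left(\frac{\Delta\mu}{2\sigma}\right),$$

where $\Delta\mu$ is the distance between any pair of mean vectors, and 
$N(\Delta\mu/2\sigma)$ is the value of the standard normal density 
function for the variable $\Delta\mu/2\sigma$. The equation (28) then 
becomes 

$$\frac{4\, e_o t}{q(q - 1)\, \dfrac{\Delta\mu}{2\sigma}\, N\!\left(\dfrac{\Delta\mu}{2\sigma}\right)} = \frac{2\,\dfrac{n_o}{t} - 1}{\left(1 - \dfrac{n_o}{t}\right)^2}. \tag{30}$$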


* A geometric interpretation of categorization problems is often useful. By 
measurement space, we mean a k-dimensional space in which each coordinate 
represents one of the k receptor measurements. Thus any set of measurements 
which have been made on an input pattern may be represented as a point in 
measurement space. The decision function may be thought of as partitioning the 
measurement space into regions corresponding to the different allowable pattern 
classes and into rejection regions. 
Fig. 3 - Optimum sample partitioning for symmetric Gaussian case. 


Some curves representing (30) are plotted in Fig. 3 in which the 
proportion of the total sample to be used in the test, $n_o/t$, is shown 
as a function of t, the total sample size, with the number of allowable 
pattern classes, q, as a parameter. $e_o$ was held constant at 0.05 
(which involves the choosing of the proper value of $\Delta\mu/2\sigma$ 
for each q). 

From Fig. 3 it is seen that, for many cases, the sample should be split 
evenly between design and test, as one might intuitively suspect. How- 
ever, there are some drastic deviations from this. For instance, if the 
categorizer is to separate only two classes, and 1000 samples are avail- 
able, then only 50 of these should be used to design the machine, and 
950 should be used to test it. Consequently, it is seen that intuition may 
go wrong in some cases. 
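The two-class case quoted above can be reproduced approximately from (30). The sketch below assumes q = 2 with equally probable classes, so that the optimum error is the normal tail area beyond Δμ/2σ, and takes b = ((q - 1)/4)(Δμ/2σ)N(Δμ/2σ) as stated in the text; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

def two_class_test_fraction(e_o, t):
    """Optimum n_o/t from (30) for q = 2 classes with optimum error e_o."""
    std = NormalDist()                   # standard normal distribution
    a = std.inv_cdf(1.0 - e_o)           # a = (delta mu)/(2 sigma) giving tail area e_o
    q = 2
    b = (q - 1) / 4.0 * a * std.pdf(a)   # design constant b for this example
    c = e_o * t / (q * b)                # left side of (30), with p = q
    # With y = 1 - n_o/t, (30) reads c*y**2 + 2*y - 1 = 0; take the positive root.
    return 1.0 - (sqrt(1.0 + c) - 1.0) / c

fraction = two_class_test_fraction(0.05, 1000)
print(f"test fraction {fraction:.3f}, design samples {round((1 - fraction) * 1000)}")
```

The computed test fraction is about 0.96, in reasonable agreement with the 950 of 1000 samples read from Fig. 3.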


CONCLUSION 


This paper has begun an analysis of some of the problems which arise 
in the design and analysis of pattern recognition experiments. In Part II, 
the problem of optimum sample partitioning between the design and 
test phases of a pattern recognition machine was investigated for the 
case of a fixed total sample size and no overlap between the design and 
test samples. The general relation between the optimum partitioning 
and the total sample size, optimum error rate, and design efficiency was 
derived. From this, it was apparent that the test sample size should 
never be smaller than the design sample size. These results are non- 
parametric in the sense that they do not depend on the detailed structure 
of the recognition machine. It is only necessary that the deviation of the 
designed machine from the optimum machine be small, and that the 
design of the machine be done in such a way that (22) holds. 

However, the actual computation of the optimum sample partitioning 
does depend strongly on the detailed structure of the machine through 
the quantity b. Since this computation is quite difficult even in the 
simplest of cases, the interesting question arises as to the possibility of 
estimating b from the sample. Another interesting phase of this problem 
which has not been attacked here concerns the case when the design 
sample and test sample overlap — that is, some of the sample patterns 
from the design sample are also used in the test sample. In the limit, 
this reduces to using the total sample for both design and test purposes. 
In this case, the results of the test are usually not very reliable. Conse- 
quently, there may be some sample partitioning with overlap which is 
better (in the sense discussed in this paper) than for either the case of 
no overlap or the case of total overlap. 
APPENDIX 


We determine here the coefficient b in (22) for the example discussed 
in this paper. If the mean vectors are more than about $3\sigma$ apart, then 
only a small error is made if the total error is approximated by adding 
the errors of each hyperplane taken alone. That is, the integrals on the 
wrong side of the hyperplane that are counted more than once will be 
quite small compared to the integrals counted only once. 

Due to the symmetry of the problem, the error associated with each 
hyperplane for the optimum decision function is identical, and the deriva- 
tives of (21) will also be identical for each hyperplane. Since there are 
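q(q - 1)/2 hyperplanes, b may be expressed (from (21) and (22)) as 

$$\frac{b}{m} = \frac{q(q - 1)}{2} \cdot \frac{1}{2} \sum_{i=1}^{k} \left[ \frac{\partial^2 e_{12}}{\partial \bar{x}_{i1}^2}\, \sigma_{\bar{x}_{i1}}^2 + \frac{\partial^2 e_{12}}{\partial \bar{x}_{i2}^2}\, \sigma_{\bar{x}_{i2}}^2 \right]_{\boldsymbol{\mu}_1, \boldsymbol{\mu}_2}, \tag{31}$$

where the hyperplane separating classes 1 and 2 is taken as typical, 
and the independence of the estimates is used. $e_{12}$ is the error 
associated with this hyperplane, $\boldsymbol{\mu}_1$ and 
$\boldsymbol{\mu}_2$ are the mean vectors of these classes, and 
$\bar{\mathbf{x}}_1$ and $\bar{\mathbf{x}}_2$ are the estimates of the 
mean vectors. 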
There is no loss in generality if 4, is taken as zero, and all the com- 
ponents of we (m2, °-* ee) are taken as zero except for py. . That is, 


$$\mu_1 = (0, 0, \ldots, 0)$$
$$\mu_2 = (\mu, 0, \ldots, 0),$$


where $\mu_{12}$ is denoted $\mu$, $\mu > 0$. Consequently, the optimum boundary is
given by


2 
0 é12 


H1+K2 Oxi? 














$$x_1 = \mu/2.$$
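As an illustrative sketch (not part of the original derivation), the error of the optimum boundary itself can be evaluated numerically. It assumes, as in the example, that each class is normally distributed with the same standard deviation $\sigma$ in every component, so that the chance of a class-1 pattern falling beyond $x_1 = \mu/2$ is the Gaussian tail beyond $\mu/2\sigma$. The function name `boundary_error` is introduced here only for illustration.

```python
import math

def boundary_error(mu: float, sigma: float) -> float:
    """Probability that a N(0, sigma^2) variate exceeds mu/2.

    This is the class-1 error for the optimum boundary x_1 = mu/2,
    assuming equal spherical normal distributions for both classes.
    """
    return 0.5 * math.erfc(mu / (2.0 * sigma * math.sqrt(2.0)))

# Error per boundary for a few separations, in units of sigma.
for separation in (2.0, 3.0, 4.0):
    print(separation, boundary_error(separation, 1.0))
```

The error falls quickly as the separation grows, which is why regions counted by more than one hyperplane contribute little once the means are a few $\sigma$ apart.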
A sampling of size m is taken from each class, and the mean vectors 
are estimated, giving 
$$\bar{x}_1 = (\bar{x}_{11}, \bar{x}_{21}, \ldots, \bar{x}_{n1})$$


$$\bar{x}_2 = (\bar{x}_{12}, \bar{x}_{22}, \ldots, \bar{x}_{n2}).$$


A boundary given by (29) is computed based on the above estimates, 
and this, together with the other estimated boundaries, determines 
the structure of the machine. 

The error $e_1$ associated with this particular boundary for class 1 is


[ 1 1 (x;\’ 
q = —_- exp —~[—) dz; 
; U 0 \/2n0 " 5 (2) ’ 


a sos -3(2) a 
&1 (@).. 21K) V/ Qro 2\o ee 


where &(%2, -°+ ,v%) is the value of x, on the boundary, and is given by 
(from (29) ) 


: Xi — Xie 1< (Ze — Z;2') 
£1(ae, aoe, = 2) ee 
i=2 Li — Le 2721 fn — Fr 





os = k om os - 2 ar) 
= Yu + Xe _ 1 2(fa = Ein) X; = (Fa = Ei2 ) 
2 2 t= Xu — Le 
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-in(¢), 2SiK<n, 


where $N(\mu/2\sigma)$ is the value of the standard normal density function
for the variate $\mu/2\sigma$. In a like manner,


_~_t _—#\)__t B “3 
an aN GH) --g¥() 25 


where $e_2$ is the error associated with this boundary for class 2. Since the
total error for this boundary is $\epsilon_{12} = e_1 + e_2$, then
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A like result holds for 0°e., 2 < 7 S n. Going through this same 
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It would also be found that 


= a 
a ae v(Z). 


This analysis is perfectly general for arbitrary mean vectors, providing
that $\mu$ is merely interpreted as the distance between a pair of mean
vectors (all such distances being assumed identical). This distance will
henceforth be written $\Delta\mu$ to indicate that it is a difference of means.
Therefore, from (31), we find that 


_ aq — 1) Ap (3) 
= 4 55 oY Noe : 
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Lined Waveguide* 


By H. G. UNGER 
(Manuscript received October 19, 1961) 


The existing approximate analysis of wave propagation in lined waveguide
is, under practical conditions, limited to linings thinner than 0.025
per cent of the waveguide diameter. A more exact analysis is presented
here for the straight and curved waveguide and for all practical linings.
In the case of anisotropic or sandwiched linings, the boundary value problem
is formulated using wall impedances. The single isotropic lining is taken
as an example to prove this formulation useful for typical cases.

The exact analysis shows that neither the thickness nor the permittivity
of the lining can increase the phase difference between TM$_{11}$ and TE$_{01}$
beyond a certain limit. The curvature coupling between these two waves is
enhanced slightly by the lining.


I. INTRODUCTION 


Round waveguide with dielectric lining shows promise as a communi- 
cation medium.’ The circular electric wave loss in perfectly straight 
and round metallic waveguide decreases steadily with frequency. Any 
deformations of the cross section or curvature of the guide axis degrade 
these ideal transmission characteristics. 

The TM$_{1n}$ waves in particular are coupled by curvature to TE$_{0n}$,
and since they propagate with nearly equal phase velocity, there will
be large mode conversion. A dielectric lining close to the wall changes
the TM$_{1n}$ waves appreciably with almost no change to TE$_{0n}$. The phase
velocities are now different and, despite curvature coupling, mode con- 
version stays small. 

When the lining is made lossy it will serve still another purpose.” 
Circular electric wave loss is increased only very little, while all other 
waves suffer an effective dielectric loss. This mode filtering loss will 
reduce the degrading effects of mode conversion and reconversion. 


* This work was performed under a Letter Contract with Bell Telephone Labo- 
ratories at the Institut fuer Hoechstfrequenztechnik, Technische Hochschule 
Braunschweig. 
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To serve both purposes best, the lining should be made anisotropic 
or sandwiched from different materials.’ The circular electric wave loss 
will then remain very low, yet all other waves will suffer high loss and 
also curvature loss will stay small. 

Wave propagation in straight and curved lined waveguide has been 
analyzed elsewhere.’” Likewise, imperfections in the lining and cross- 
sectional deformations have been calculated.** However, the lining was 
always assumed to be very thin, and so far only a first-order approxi- 
mation has been found. 

On the other hand, it has been shown both theoretically and experi- 
mentally that these approximations do not in general hold for any 
practical linings.’ Linings which are designed optimally change wave 
propagation much more than could be described by a first-order ap- 
proximation. 

An analysis of wave propagation in lined waveguide will be presented 
here which is sufficiently general and accurate to hold for all practical 
cases. Sandwiched and anisotropic linings will also be considered. 

Circular electric wave transmission is most strongly degraded in curved 
waveguide. Therefore, the lined waveguide will be assumed to have 
curvature. Cross-sectional deformations and imperfections of the 
lining will be analyzed with corresponding accuracy in another paper.” 


II. NORMAL MODE FIELDS 


Normal modes of straight round waveguide with a single isotropic 
lining have been analyzed before.’ To adapt this analysis to an investi- 





Fig. 1 — Lined waveguide with curvature. 
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gation of bends in lined waveguide and of waveguides with an irregu- 
lar lining, the boundary value problem will be repeated and generalized 
here. 

The waveguide structure to be considered is shown in Fig. 1. The 
dielectric lining will later on be assumed to be heterogeneous or aniso- 
tropic or irregular. At present, however, a single uniform and homo- 
geneous lining is assumed. 

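The electromagnetic field in the waveguide can be derived from two
sets of scalar functions $T_n$ and $T_n'$ given by:


$$\left.\begin{aligned} T_n &= N_n J_p(\chi_n r)\,\sin p\varphi \\ T_n' &= N_n J_p(\chi_n r)\,\cos p\varphi \end{aligned}\right\} \quad \text{for } 0 < r < a \tag{1}$$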


and 


acy — cH, 
H,© (ke) — cH “OK 5 sin py 





2 
T, = N, Illa) = 


for -a-<r<.b. (2) 


p (xnt) — ¢Hy” (x01) 


2 
a Xn ip AXnt) 7 CAlp AXn?) 
Ly = Ny Xn cae ee T® (Ie e) _— V/H > (len e) COS Pe. 


The T functions satisfy the wave equation: 


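$$\frac{1}{e_1 e_2}\left[\frac{\partial}{\partial u}\left(\frac{e_2}{e_1}\frac{\partial T}{\partial u}\right) + \frac{\partial}{\partial v}\left(\frac{e_1}{e_2}\frac{\partial T}{\partial v}\right)\right] = -\chi^2 T \tag{3}$$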


in a general orthogonal curvilinear coordinate-system (u,v,w), in which 
the element of length is: 


$$ds^2 = e_1^2\,du^2 + e_2^2\,dv^2 + e_3^2\,dw^2. \tag{4}$$


The curved waveguide may be described in these coordinates when, 
according to Fig. 1: 


$$u = r, \qquad v = \varphi, \qquad w = z \tag{5}$$
$$e_1 = 1, \qquad e_2 = r, \qquad e_3 = 1 + \xi \tag{6}$$


where 


COS ¢. (7) 


=) 
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The field components are written in terms of the field functions (1) and 


(2) 
i, = SV E + dn P| 
a €,0U €200 


OT, 
fy = 2V E - dn re | 











oT, Aor 8) 
Hy = ~ 2h EE dn Ie rs | ; 
oT, Aor 
H, = 2In | one + dn Te Te |. 


Maxwell’s equations are: 


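$$\frac{1}{e_2 e_3}\left[\frac{\partial}{\partial v}(e_3 E_w) - \frac{\partial}{\partial w}(e_2 E_v)\right] = -j\omega\mu_0 H_u \tag{9}$$

$$\frac{1}{e_3 e_1}\left[\frac{\partial}{\partial w}(e_1 E_u) - \frac{\partial}{\partial u}(e_3 E_w)\right] = -j\omega\mu_0 H_v \tag{10}$$

$$\frac{1}{e_1 e_2}\left[\frac{\partial}{\partial u}(e_2 E_v) - \frac{\partial}{\partial v}(e_1 E_u)\right] = -j\omega\mu_0 H_w \tag{11}$$

$$\frac{1}{e_2 e_3}\left[\frac{\partial}{\partial v}(e_3 H_w) - \frac{\partial}{\partial w}(e_2 H_v)\right] = j\omega\epsilon\epsilon_0 E_u \tag{12}$$

$$\frac{1}{e_3 e_1}\left[\frac{\partial}{\partial w}(e_1 H_u) - \frac{\partial}{\partial u}(e_3 H_w)\right] = j\omega\epsilon\epsilon_0 E_v \tag{13}$$

$$\frac{1}{e_1 e_2}\left[\frac{\partial}{\partial u}(e_2 H_v) - \frac{\partial}{\partial v}(e_1 H_u)\right] = j\omega\epsilon\epsilon_0 E_w. \tag{14}$$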


$\mu_0$ and $\epsilon_0$ are permeability and permittivity of free space. $\epsilon$ is the relative
permittivity of the respective cross-sectional part of the waveguide.

Substituting from (8) into (11) and (14) and taking advantage of 
(3) the longitudinal field components are obtained: 


Hy = jucoBV dy iss 7. 
(15) 


2 
ae Xn 
Ky = Jemez re y= Ls 
where $k = \omega\sqrt{\epsilon\epsilon_0\mu_0}$ is the intrinsic propagation constant of the medium
in a particular cross-sectional part of the guide. $\epsilon$ and $k$ have constant
but different values for the different cross-sectional parts of the guide.
In (8), this dependence on the cross-sectional part of the guide is
not only true for $\epsilon$ and $k$, but until we learn more, also for $d_n$.

The quantities $d_n$, $c$, $c'$ and the separation constants $\chi_n$ and $\chi_n^e$ must
be chosen so that the boundary conditions of the lined waveguide are
satisfied.

These boundary conditions are, at the surface of the lining: 


$$E_w^i = E_w^e \tag{16}$$
$$H_w^i = H_w^e \tag{17}$$
$$E_v^i = E_v^e \tag{18}$$
$$H_v^i = H_v^e \tag{19}$$
and at the metal surface
$$E_w = 0 \tag{20}$$
$$E_v = 0. \tag{21}$$


The superscripts $i$ and $e$ indicate the internal and external field
components at the surface of the lining.
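To satisfy (20):


$$c = \frac{H_p^{(2)}(\rho k_n^e)}{H_p^{(1)}(\rho k_n^e)} \tag{22}$$

where

$$\rho = \frac{b}{a} \tag{23}$$

and

$$k_n = \chi_n a, \qquad k_n^e = \chi_n^e a. \tag{24}$$

To satisfy (21)

$$c' = \frac{H_p^{(2)\prime}(\rho k_n^e)}{H_p^{(1)\prime}(\rho k_n^e)}. \tag{25}$$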


The prime at the Bessel functions denotes differentiation with respect 
to the argument. 

The condition of $E_w$ being continuous across the surface of the lining
is satisfied by virtue of the formulation of the T-functions in (1) and
(2). To satisfy (17)


$$d_n^i = d_n^e.$$

$d_n$ is therefore independent of the cross-sectional area of the guide and
needs no superscript.
To satisfy (18) 


1 Knken® [Jee ke Che = ane) (08) 


dn,  pk?a’(e — 1) 


J p(Kn) Kn Hy? (kn?) — c'Hy™ (kn?) 





The remaining condition (19) leads to the following (characteristic) 
equation of the lined waveguide: 


Be _ be Hy” (ky) _ zee 
Tollin) Tey? Hy (lx?) — &'H,©(h,8) 














[gett _ bn Hy” (ke') — Hy Ae) | (97) 
dalthty) kn Hy? (kn?) — cHy™ (k,°) 
hia ka’ 
2 2 fln 
= Pp (€ = 1) h2k, 2 . 
The characteristic equation (27), together with

$$k_n^2 = (\omega^2\epsilon_0\mu_0 - h_n^2)\,a^2, \qquad k_n^{e2} = (\omega^2\epsilon\epsilon_0\mu_0 - h_n^2)\,a^2 \tag{28}$$

determines the separation constants $k_n$ and $k_n^e$.
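The relation (28) between the propagation constant and the two separation constants can be sketched numerically. The fragment below is only an illustration: it reads (28) as $k_n^2 = (k^2 - h_n^2)a^2$ and $k_n^{e2} = (\epsilon k^2 - h_n^2)a^2$ with $k^2 = \omega^2\epsilon_0\mu_0$, and the function name and the normalized arguments `a_over_lambda` and `h_over_k` are assumptions made for this example.

```python
import cmath
import math

def separation_constants(a_over_lambda: float, eps_r: complex, h_over_k: complex):
    """Return (k_n, k_n^e) for a given normalized propagation constant h_n/k.

    a_over_lambda : lining radius a divided by the free-space wavelength
    eps_r         : relative permittivity of the lining
    h_over_k      : propagation constant h_n divided by k = 2*pi/lambda
    """
    ka = 2.0 * math.pi * a_over_lambda
    k_n = ka * cmath.sqrt(1.0 - h_over_k ** 2)        # interior of the guide
    k_n_e = ka * cmath.sqrt(eps_r - h_over_k ** 2)    # inside the lining
    return k_n, k_n_e

# A wave slightly slower than free space has an imaginary k_n, i.e. its
# field decays from the lining towards the axis of the guide.
print(separation_constants(4.7, 2.5, 1.01))
```

Using `cmath.sqrt` keeps the same code valid when $k_n$ becomes imaginary or when a lossy lining is described by a complex permittivity.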
The transverse field components of any two different modes are or- 
thogonal to each other in that’ 


1 
Vialm 





| Gu x Hin) a8 
S 
aT. OT, \ (AT n lim OT mn. 
7 he | (2 Tey Ts.) (= + dn ke Te) (29) 


aT dT n\ (AT. lim OT m. 
Ce Og a Ee ee ae Gon 
a (4 uty (2 k? edu )| no 
The integration is to be extended over the entire cross section of the
guide. The quantity $\delta_{nm}$ is the Kronecker delta. To satisfy (29) for
$n = m$ requires $N_n$ to have a certain value, which is to be determined
from (29).





III. GENERALIZED TELEGRAPHIST’S EQUATIONS FOR CURVED WAVEGUIDE 


All quantities in (8) and (15) have now been determined except the
current and voltage coefficients $I_n$ and $V_n$. To find relations for them,
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(8) is substituted for the field components into Maxwell’s equations. 
Then 
- & 5 dle OLA 


a dm 2 oou =) e times (9) 


is added to 








2 arn ¢ 
m (2% d hn” OT m 


mF» i it 
e10U ke re) times C10) 


and the result is integrated over the cross section. Using the orthonor- 
mality condition (29), the wave equation (3), and the boundary con- 
ditions (16), (19), and (20), one obtains: 


h "y 2 2 
Te FI In = jomEl i gherm on 7 ds 
n 8 k? 
h,’ oT, \ (aT a) 
—= - EL n Saal — dan bACE m 
he | (2 k? Ts ) (oe k? edu (30) 


OT, ee lim OT m 
= (2%: + dn Te ie fe.) (2 ce "he e Te | is}. 


—€3 es + dn a) times (12) 
e€10u €0 














Similarly 


is added to 


—€3 (2 — dn dD fn) times (13) 
€200 €,0U 


and the result is integrated over the cross section: 





ane ile ee oe 
Ain + weVim = JumZV if te dnd 2 Paola OS 


- fof (0) (Bat) ww 
€300 " e10Uu €,0V e10U 
aT aT, \ (aT OT m. 
n bait vim mn m d ; 
a & aes Ts.) (2 Ae Te.) s} 


Equations (30) and (31) are the generalized telegraphist’s equations 
for the curved waveguide with a dielectric lining. 
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Introducing traveling waves 


1 (32) 
i= /K,, (Qn oe Bm) 


with 


the more convenient form in terms of amplitudes of forward ($a_m$) and
backward ($b_m$) traveling waves is obtained.























dam 5 _ are = 
dw + NC = J2( Conn an + Cmn bn) 
(33) 
ee = ~—J2(Cmnt Vn -+ Cnn On) 
dw myYm = mn n mn n . 
The coupling coefficients in (33) are 
Cmn 
W” Ug€o / 2 (2 fa Le ae.) (2 Vn hn’ T=) 
= 2Vlmlhn Js ee eu oes ke edv] \edu ees Ke esdv 
aT h,’ oT ) (4 h,’ oT ) | 
gn ESN ell Oe a 
T (= k? edu/ \e.dv k? edu a 
| | (2 aT n o) (2 OT n ora) 
2 s se e10u aes e,0v / \edu oo 6.00 (34) 











/ 
+ (2 — a2) (2a — a, eV] as 
€200 €30U €200 €1,0U 
Co” Ug€o 


————— 2 Xn Xm 
eel XX" 7,1 dS 





To analyze circular electric wave propagation, it is sufficient to con- 
sider only coupling between circular electric and other waves. Let m 
denote the TE$_{0m}$ wave; then


$$T_m = 0$$
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and (34) reduces to 














cana ts cua Ot: J ) OT mn. 
me OA/ Ant, Js Nendo "Kh? edu] edu 
lit, eae aT OT. \ Ol n, 
a mitbn dm { ee dy, : : 
+- 5 VSh h | " fe (om fn) €,0U aS oe 
dy, 





Ss obi TT as | 
WO" Loeo YS 

To find the $z$-dependence of the wave amplitudes $a_n$ and $b_n$ for
certain initial conditions requires the solution of the generalized
telegraphist's equations (33).

They are a system of simultaneous first-order and linear differential
equations and can be solved by standard methods. From this point of
view, (33) with (35) and all the preceding definitions represent the
formal solution to propagation of circular electric waves in round wave- 
guide with a single uniform lining. But this formal solution is still to be 
reduced to a practical form accessible for numerical evaluation. Also, 
heterogeneous or anisotropic linings have yet to be considered. 
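To illustrate what solving such a system by standard methods involves, the following sketch integrates a much simplified version of it: two forward waves only, with constant real coupling. The model $da_1/dz = -jh_1a_1 + jca_2$, $da_2/dz = -jh_2a_2 + jca_1$, its sign convention, the function names, and the numerical values are assumptions chosen for illustration; they are not the coefficients of (33) to (35).

```python
import math

def propagate_two_modes(h1, h2, c, length, steps=4000):
    """Integrate the two-mode model with a fourth-order Runge-Kutta scheme.

    Mode 1 starts with unit amplitude at z = 0; returns (a1, a2) at z = length.
    """
    def deriv(a1, a2):
        return (-1j * h1 * a1 + 1j * c * a2,
                -1j * h2 * a2 + 1j * c * a1)

    dz = length / steps
    a1, a2 = 1.0 + 0.0j, 0.0 + 0.0j
    for _ in range(steps):
        k1 = deriv(a1, a2)
        k2 = deriv(a1 + 0.5 * dz * k1[0], a2 + 0.5 * dz * k1[1])
        k3 = deriv(a1 + 0.5 * dz * k2[0], a2 + 0.5 * dz * k2[1])
        k4 = deriv(a1 + dz * k3[0], a2 + dz * k3[1])
        a1 += dz * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]) / 6.0
        a2 += dz * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]) / 6.0
    return a1, a2

def peak_conversion(h1, h2, c):
    """Closed-form maximum of |a2|^2 for the same two-mode model."""
    return c ** 2 / (c ** 2 + ((h1 - h2) / 2.0) ** 2)

# Equal phase constants allow complete conversion; a phase difference
# that is large compared with the coupling keeps the conversion small.
print(peak_conversion(10.0, 10.0, 0.5), peak_conversion(10.0, 12.0, 0.5))
```

In this model the converted power never exceeds `peak_conversion`, which is the quantitative form of the statement that mode conversion stays small once the lining separates the phase constants.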


IV. A FIELD APPROXIMATION IN THE LINING


Before proceeding any further, an approximation will be made, which 
is justified for all practical linings in round waveguide for circular elec- 
tric wave transmission. In all practical cases, the lining is thin compared 
to the radius of the guide 


$$\delta = \rho - 1 = \frac{b - a}{a} \ll 1. \tag{36}$$


Furthermore, it can be seen from (35) that there is only coupling
between TE$_{0m}$ waves and waves of first circumferential order, i.e. $p = 1$.
For these waves it may be safely assumed that


$$p \ll |\chi_n^e r| \tag{37}$$
for all
$$a \le r \le b$$


within the lining, since $|\chi_n^e r| \ge |k_n^e|$ and, according to (28),


$$k_n^e = ka\sqrt{\epsilon - \frac{h_n^2}{k^2}}.$$


Under conditions (36) and (37), the wave functions (2) for the
lining may be simplified by replacing the Hankel functions by their
asymptotic expressions. The result is:


kn? sin ( = ") Ke 
J p(Kn) 





Tr, = Nn 








a : sin py 
kn sin (p — 1)ky° 
(38) 
r e 
Ken? cos (. _ _ Ke 
Lo NG es J (Kn) “cos (p — 1)kne COS De. 
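The step from Hankel functions to trigonometric functions can be made explicit. The following is a sketch using the standard leading-order asymptotic forms for large argument, together with $c$ and $c'$ as defined by the conditions at the metal surface; the abbreviations $x = \chi_n^e r$, $x_b = \rho k_n^e$ and $\theta$ are introduced here for convenience only.

```latex
H_p^{(1)}(x) \approx \sqrt{\frac{2}{\pi x}}\, e^{\,j(x-\theta)}, \qquad
H_p^{(2)}(x) \approx \sqrt{\frac{2}{\pi x}}\, e^{-j(x-\theta)}, \qquad
\theta = \frac{p\pi}{2} + \frac{\pi}{4}

c \approx e^{-2j(x_b-\theta)}, \qquad c' \approx -e^{-2j(x_b-\theta)}

H_p^{(2)}(x) - c\,H_p^{(1)}(x) \approx 2j\sqrt{\frac{2}{\pi x}}\; e^{-j(x_b-\theta)} \sin(x_b - x)

H_p^{(2)}(x) - c'\,H_p^{(1)}(x) \approx 2\sqrt{\frac{2}{\pi x}}\; e^{-j(x_b-\theta)} \cos(x_b - x)

\frac{H_p^{(2)}(\chi_n^e r) - c\,H_p^{(1)}(\chi_n^e r)}{H_p^{(2)}(k_n^e) - c\,H_p^{(1)}(k_n^e)}
\approx \sqrt{\frac{a}{r}}\;
\frac{\sin\left(\rho - \frac{r}{a}\right)k_n^e}{\sin\,(\rho - 1)\,k_n^e}
```

For a thin lining the factor $\sqrt{a/r}$ differs from unity by at most about $\delta/2$ and may be dropped, which leaves the sine and cosine ratios that appear in the simplified wave functions.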
The characteristic equation may now also be simplified to: 
ze! . € ae 9 h2k?at 
(v, iene tan 6 i) (u, a kine cot 6 i’) = p (e 1) Ken’ Kin? (39) 
where 
Jp (kn) 
yn = 40 
ae SAC es 


Likewise, using (39) and the simplified form of (26), the factor $d_n$ may
be written:


2 2 
k, he" 


d, = Ae ae (s cot 6 ky® + in) : (41) 


The orthonormality condition (29) determines the normalization 
factor N, . Using the asymptotic expressions (38) for the field functions 
within the lining, the integration in (29) results in: 


9 2 ‘i p S. - 
=) (1 = 12 = kn Yn a 2) 





T ar 2) 272 { h 
5 N, fie a (kn) (1 +f- dn Ie 


\ 


7 D hn _ hy ékn4 

2 dr, k,2 (1 oe ) (1 i eke Ion’ (42) 
Kin? 26k,” + sin 26k, ‘og? hy’ 25k,’ — sin 26 k,° 4 

a sin? 6 k,,° "he cos? 6 k,¢ ae 





+ 





For circular electric waves with p = 0, the integration results in: 


2 
aN. i din Km J 0 (Km) vm [1 = es ie + 2Ym 
(43) 





Fim 26km —~ sin 26 Rm? | 1 
he 2 cos? 6 kin? ; 


+ 
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The asymptotic expressions (38) have also been used to calculate 
the coupling coefficients (35) for circular electric waves. Instead of 
Yn, another abbreviation 








_ J1(kn) 
tn = BT) i 
has been used to facilitate numerical evaluation: 
a T Kem d,,N rN. m a Vis 
Cman = 5 VW Rinhin eee ieee Jil ki) Jo(km) R (1 + ts) 1 + a. 
2 2 x : 
ae g Me (ea 2 Sltflsleg be +e [1 — 235 | vm 
"he? ee? — kn2 Sn " Kent — Ken? Ln 
k,? hege < k2 k ° tan k ‘ 6 
2 : en aS RO 
dnkin im F € bo ke Sas ( Km? tan Kye 6 
(45) 


e2 


é€ Ion é€ kn é 
—d,k, |1—- tan k,° 6 + d, tan k,, 6 


eka? lige 








2 e 2 2 
Xn Him Kin’ \ Berd? = Kim? ‘ 
+ d,, 7 (1 a he f,) ee _ ag km tan km 6 


) 
2 hin 2 Dun Xm 
+ (in +& h, i) ea h, (= I ‘ 


Equations (39) to (45) reduce the problem to as simple analytical 
expressions as seems to be possible at this time. Any further simplifica- 
tion would only be brought about by replacing the trigonometric func- 
tions and the Bessel-functions by their Taylor series expansions for 
small arguments, respectively arguments k,, close to the roots of Bessel- 
functions for the empty guide. Such simplification, however, would 
lead to a first-order approximation for very thin lining, which has been 
studied in detail elsewhere.’ 

The present aim is for a better approximation. Therefore, numerical 
methods starting with expressions (39) through (45) will have to do 
the rest. 

To this end the characteristic equation (39), which in implicit form
determines the separation constant $k_n$, will first have to be solved.
For a lossless or low-loss lining, the relative permittivity is real or may
be assumed real. Then all the quantities in (39) are real. An iterative
procedure for this solution has been found earlier.’

For the sake of completeness it will be repeated here using the present 
symbols. Equation (39) may be written as: 


. FP 4 
cot dk, -Faa g/t) (46) 


fs 1 Ji (3 1) 2P hn kia "ka | Ele, (47) 


Kfy, € kent Ion?” € 





where 








A value for the relative permittivity $\epsilon$ of the lining is now specified.
For a free-space propagation constant $k = 2\pi/\lambda$, a wave propagation
constant $h_n$ is assumed and in calculating

$$k_n^2 = k^2a^2 - h_n^2a^2, \qquad k_n^{e2} = \epsilon k^2a^2 - h_n^2a^2 \tag{48}$$


for lack of knowing the true radius of the lining, $a$ is replaced by $b$,
the radius of the guide. Using (47), a first approximation for the
relative thickness $\delta$ is found. In general, according to the two signs of the
square root in (46), there will be two such values of relative thickness
which will lead to the same propagation constant $h_n$.

The first approximation for $\delta$ is used to correct the radius $a$ of the
lining in (47) and (48). The calculation is then repeated. Since for
small values of $\delta$, a change in $\delta$ affects the right-hand side only slightly,
this method converges rapidly.
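The structure of this iteration can be sketched as follows. The expressions (46) and (47) themselves are not reproduced; they are represented by a caller-supplied function `thickness_fn`, a hypothetical stand-in that returns the relative thickness for a trial lining radius. The update of the radius assumes $\rho = b/a = 1 + \delta$.

```python
def solve_lining(b, thickness_fn, tol=1e-10, max_iter=50):
    """Fixed-point iteration for the lining radius a and relative thickness.

    b            : radius of the guide
    thickness_fn : callable a -> relative thickness (stand-in for (46), (47))
    """
    a = b  # first pass: the lining radius is unknown, so use the guide radius
    for _ in range(max_iter):
        delta = thickness_fn(a)
        a_new = b / (1.0 + delta)  # corrected radius, from b/a = 1 + delta
        if abs(a_new - a) <= tol * b:
            return a_new, delta
        a = a_new
    raise RuntimeError("iteration did not converge")

# Toy stand-in whose value depends only weakly on a, as described above.
radius, thickness = solve_lining(1.0, lambda a: 0.02 * a)
print(radius, thickness)
```

Because the thickness estimate depends only weakly on the trial radius, each pass changes the radius by a small fraction of the previous correction, and a few passes suffice.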

For typical values of $b/\lambda$ and $\epsilon$ the phase constants of four normal
modes have been plotted in Fig. 2.* The modes shown in Fig. 2
degenerate into TE$_{11}$, TM$_{11}$, TE$_{12}$ and TM$_{12}$ when the lining is very thin.
Of all the modes, these four are most strongly coupled to TE$_{01}$ by
curvature. The broken lines represent first-order approximations as they
have been found earlier.’ Note that the first-order approximations
hold only for extremely thin linings. In the case of the TE$_{11}$ wave, in
particular, the lining should be less than 0.05 per cent of the
waveguide radius. Here the first-order approximations are of no use
whatsoever.

Note also that the TM$_{11}$ phase constant does not increase as expected
from the first-order approximation. The curve levels off, and eventually


* These and most of the other numerical results have been obtained by H. P. 
Kindermann.® 
Fig. 2. Change in phase constant of normal modes in lined waveguide, $b/\lambda$ =
4.70, $\epsilon$ = 2.5; solid line exact, broken line approximate.


a heavier lining will not change the TM$_{11}$ phase. According to Fig. 3,
at higher frequencies the TM$_{11}$ phase levels off at even lower values.
Also, a higher permittivity will not change these relations.

This is, of course, very unfortunate since to reduce curvature losses
the TM$_{11}$ phase should differ most from the TE$_{01}$ phase.

Having solved the characteristic equation, curvature coupling is 
Fig. 3. The phase constant of TM$_{11}$ is, over a wide range, nearly independent
of $\delta$ and $\epsilon$.
obtained by substituting numerical values into (45). For the TE$_{01}$
characteristics first-order approximations may be substituted. Even a
very heavy lining does not change these characteristics very much. For
example, with $\delta = 0.03$ and $\epsilon = 2.5$ the relative change in phase
constant of TE$_{01}$ according to this approximation is only:

$$\frac{\Delta\beta_n}{\beta_n} = 2 \cdot 10^{-4}.$$

The coefficients of curvature coupling between TE$_{01}$ and TE$_{11}$, TM$_{11}$
and TE$_{12}$ are plotted in Fig. 4. Note again that any first-order
approximations hold only for extremely thin linings.

The coupling between TE$_{01}$ and TM$_{11}$ is at first increased by the lining
and then stays almost constant. The increase in TE$_{01}$-TM$_{11}$ coupling
disagrees with another first-order approximation.* The present result
has to be considered correct, however, since the corresponding shielded
helix waveguide curvature coupling is about equally enhanced.”
Fig. 4. Coefficient of curvature coupling in lined waveguide, $b/\lambda$ = 4.70, $\epsilon$ =
2.5; solid line exact, broken line is wall impedance representation.
Curvature coupling between TE$_{01}$ and TE$_{11}$ decreases substantially
in lined waveguide as is shown by the solid line in Fig. 4. The broken
line will be referred to later. TE$_{01}$-TE$_{12}$ coupling is nearly independent
of the lining.


V. A WALL IMPEDANCE REPRESENTATION


The preceding analysis of lined waveguide considers only the simplest 
case, of a single isotropic and uniform lining. Yet this analysis could 
only be reduced to analytical expressions which are still quite involved 
and require a lot of computation for numerical evaluation. It is even 
more difficult to analyze a waveguide with a more complicated lining 
by the same methods. Further simplifications are necessary to facilitate 
the analysis of anisotropic or heterogeneous linings. 

Such simplifications are brought about when the effects of the lining 
are described by wall impedances which the lining presents to the 
waveguide interior. Looking in radial direction, two wall impedances 
may be defined which are associated with the two different polarizations 
of the fields: 


$$Z_w = -\frac{E_w}{H_v} \tag{49}$$
$$Z_v = \frac{E_v}{H_w}. \tag{50}$$

For a mode n represented by one term n of the series expansions (8) 
and (15), these wall impedances can be expressed by the field functions 
(38): 











Z Xa 1 
eS anecy 3 p ha? (51) 
cot jee 6 + d;, ie,.? ike 
Z, = qe | tan i es i (52) 
Nae Ankn® 
For circular symmetric modes $p = 0$ and

$$Z_w = j\,\frac{\chi_n^e}{\omega\epsilon\epsilon_0}\,\tan k_n^e\delta \tag{53}$$

$$Z_v = j\,\frac{\omega\mu_0}{\chi_n^e}\,\tan k_n^e\delta. \tag{54}$$
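As a numerical illustration, the wall impedances of circular symmetric modes can be evaluated directly. The sketch below reads (53) and (54) as $Z_w = j(\chi_n^e/\omega\epsilon\epsilon_0)\tan k_n^e\delta$ and $Z_v = j(\omega\mu_0/\chi_n^e)\tan k_n^e\delta$, with $k_n^e = \chi_n^e a$ and $\chi_n^{e2} = \epsilon k^2 - h_n^2$ taken from (28). The function name, argument list, and sample dimensions are assumptions for this example.

```python
import cmath
import math

EPS0 = 8.8541878128e-12      # permittivity of free space, F/m
MU0 = 4.0e-7 * math.pi       # permeability of free space, H/m

def wall_impedances(wavelength, a, delta, eps_r, h_n):
    """Return (Z_w, Z_v) in ohms for a circular symmetric mode.

    wavelength : free-space wavelength in meters
    a          : inner radius of the lining in meters
    delta      : relative thickness (b - a)/a of the lining
    eps_r      : relative permittivity of the lining (may be complex)
    h_n        : propagation constant of the mode in rad/m
    """
    k = 2.0 * math.pi / wavelength
    omega = k / math.sqrt(EPS0 * MU0)
    chi_e = cmath.sqrt(eps_r * k ** 2 - h_n ** 2)
    t = cmath.tan(chi_e * a * delta)          # tan(k_n^e * delta)
    z_w = 1j * chi_e / (omega * eps_r * EPS0) * t
    z_v = 1j * omega * MU0 / chi_e * t
    return z_w, z_v

# Sample values: a wave propagating nearly as in free space (h_n = k).
lam = 5.4e-3
print(wall_impedances(lam, 25e-3, 0.01, 2.5, 2.0 * math.pi / lam))
```

For a lossless lining both impedances are purely reactive, and for a very thin lining $Z_v$ tends to $j\omega\mu_0(b - a)$, the reactance of a thin shorted layer.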


The characteristic equation can be derived with these wall impedances.
Instead of satisfying the boundary conditions (16) through (19), the
two conditions (49) and (50) will now be substituted. The ratio
$-(E_w/H_v)$ and $E_v/H_w$ of the field components in the waveguide
interior adjacent to the lining will be required to be equal to the wall
impedances $Z_w$ and $Z_v$ presented by the lining:





Ey 1 
i, 7 wey a — dy, DP hy a (55) 
joe ke ke 
i ka? p ) 
- = j— (% — >) =D. 5 
Hoe! gee ” i? ds 698) 


After the factor d, has been eliminated from (55) and (56) one equa- 
tion, the characteristic equation, remains: 


j . Zy \ oe : 
(u, = 7) (u, 2) =  Atyt Ie (37) 


This equation is still exact to the same order as are the wall impedances. 
If expressions like (51) and (52) for the wall impedance were sub- 
stituted, the same equation as before, that is (39), would result. 

Instead of (51) and (52) wall impedance values for circular symmetric 
modes as given by (53) and (54) will now be substituted into (57). 
But for the rest, the circumferential index p will be kept general in 
(57). One obtains: 


_ it € . p hn : 
(u, me tan 6 is) (u, + ae cot 6 ke) = ok (58) 


This expression is quite similar to (39). 
The left-hand sides of (58) and (39) are identical, and there is only 
a small difference on the right-hand sides of these equations. The
right-hand side of (39), for example, may be written as:
hep? ktat(e — 1)? = =h,2p? (e — 1)? 


kk oe -3e ( ny (59) 





a 

All modes of interest are those which have nearly the same propaga- 
tion constant h, as the circular electric wave. Since under practical 
circumstances the circular electric wave propagates nearly as in free 
space, $h_n$ will also be nearly equal to $k$. If under these circumstances the
right-hand side of (39) according to (59) is compared with the right- 
hand side of (58), the difference is found to be very small indeed. 

To determine the normal modes in a straight waveguide with a single 
dielectric lining, it therefore seems well justified to use (58) as charac- 
teristic equation instead of (39). 
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As further confirmation, this approximation (58) has been solved 
numerically and compared with solutions of (39). 

Let $\delta'$ be the thickness of the lining. Solving (58) will give a certain
phase constant for a particular mode. To obtain the same phase constant
from (39), the thickness has to be $\delta$. Plotted in Fig. 5 for the three modes
is the relative error $(\delta' - \delta)/\delta$ that results from using (58) instead of
the exact form (39).

An analytic expression for this error can be given for a very thin 
lining, when the modes in lined waveguide may be considered first-order 
perturbations of modes in metallic waveguide. Under these conditions 
for TM modes 





and for TEpn modes 


8-5 kno [2 - | 
5 ~——s (e — 1) Fa? ka? 


where J,’(kno) = 0. 

In the numerical example of Fig. 5 the error is largest for the TE$_{11}$
mode and very thin lining. For the other two modes the error stays
generally below 1 per cent.

The wall impedance representation, therefore, holds for single iso- 
tropic linings of any practical dimensions. 
Fig. 5. For the same phase constant, the thickness of the lining is $\delta$ according
to (39) and $\delta'$ according to (58), $b/\lambda$ = 4.70, $\epsilon$ = 2.5.
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The characteristic equation (58) for the waveguide with a single 
isotropic lining is not the only form to which the more general expres- 
sion (57) can be reduced. It can be utilized in much more general cases. 
As long as we are able to determine the wall impedances $Z_w$ and $Z_v$, we
can use (57) to determine the normal modes of the structure for any
lining such as anisotropic lining or heterogeneous linings.

The example of a single isotropic lining has taught us that it is suffi- 
cient to use wall impedance values of the corresponding circular sym- 
metric modes. It is relatively easy to find these wall impedances even 
for quite complicated jacket structures. We will then be able to determine 
the normal modes characteristics of waveguides with such complicated 
jacket structures. 

To make full use of the wall impedance representation in our analysis 
of the curved waveguide, some approximations are necessary for the 
coefficient of curvature coupling (35) and the normalization factor
$N_n$.

To obtain these two quantities, products of field components and other 
functions had to be integrated over the total cross section. The range of 
integration included the lining. 

In our present representation we do not explicitly determine the field- 
distributions within the lining, but the effect of the lining is taken into 
account by only considering the input impedances as seen from the 
waveguide interior. In this representation we therefore cannot calculate 
the contributions to the various integrals by extending them over the 
lining. We will consider the effect of neglecting these contributions. 

Under practical circumstances, the area of the lining is always very 
much smaller than the total cross section. The components of the mag- 
netic field, since they are continuous across a dielectric boundary, are of 
the same order of magnitude within the lining as in the empty space of 
the waveguide. Except for a possible change of the order $\epsilon$, the same is
true for the components of the electric field. 

In summary, then, the integrals of products of field components over 
the area of the lining are always very much smaller than the correspond- 
ing integrals over the total cross section. They consequently might be 
neglected. 
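A rough estimate of the size of the neglected contributions follows from the area fraction occupied by the lining. The short calculation below is an illustration only; it uses $\rho = b/a = 1 + \delta$ and, as a sample value, the heaviest lining quoted in the numerical examples, $\delta = 0.03$.

```latex
\frac{\pi b^2 - \pi a^2}{\pi b^2}
  = 1 - \frac{1}{\rho^2}
  = 1 - \frac{1}{(1+\delta)^2}
  \approx 2\delta ,
\qquad
\delta = 0.03:\quad 1 - \frac{1}{1.0609} \approx 0.057 .
```

With field components of comparable magnitude inside and outside the lining, the integrals over the lining are therefore of the order of a few per cent of the integrals over the total cross section, and less for thinner linings.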

Under these circumstances, (42) reduces to 


T 2 2 hy 
5 Nu Ten? Sp? (In) TC ae a, fe) 


2 2 
ie Be ae) aoe hn \ qg P| _ 
( ja 1 2 Un kn ue) 2 (14 a) dy al ] 
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and (43) for circular electric waves to: 
2 
t Ny dm Km Jo (km) om (1 + Qym Kn? tn) = 1. (61) 


For the coefficient of curvature coupling we get instead of (45) 
The factor $d_n$ in all these equations can be determined from (55) or
(56). For example from (55) we get:


Seas, ) 
 D Aa2 a weAZw) 


—\1+d, 





After the characteristic equation (57) has been solved for a particular
combination of wall impedance values $Z_w$ and $Z_v$, all the other
quantities and eventually the coefficient of curvature coupling $c_{mn}$ can be
found by straightforward evaluation of (60) to (63). The wall
impedances $Z_w$ and $Z_v$ may of course be determined from the circular
symmetric field components.

The approximations which have been made to obtain (60) to (63)
have been examined more closely by numerical evaluation. The
coefficient of curvature coupling has been calculated using these equations
and compared with the plots in Fig. 4. For TM$_{11}$-TE$_{01}$ and TE$_{12}$-TE$_{01}$
coupling the differences are small enough not to show in Fig. 4. The wall
impedance representation fails only for TE$_{01}$-TE$_{11}$ coupling and $\delta > 0.8$
per cent. The corresponding coefficient of curvature coupling is shown
by the broken line in Fig. 4. Fortunately the coupling is then so small
and the phase constants of the waves differ so much that there is no
significant interaction between TE$_{01}$ and TE$_{11}$.
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The explanation of why the two methods of calculation result in 
different values in just this case is as follows: 

The TE$_{11}$ wave in round waveguide is most strongly modified by the
lining. Even in a thin lining, the TE$_{11}$ field tends to be concentrated
within the lining. In the present numerical example, it takes only a
relative thickness of $\delta = 0.4$ per cent for the radial propagation constant
$k_n$ to become imaginary and consequently the TE$_{11}$ fields to be
evanescent towards the center of the guide. When this happens, because
of very weak TE$_{11}$ fields within the guide, the curvature coupling to
TE$_{01}$ will be very weak too.

The wall impedance representation fails for calculating the coupling,
because it entirely neglects any field interaction within the lining,
which is more and more significant for TE$_{11}$ and a thick lining. This
phenomenon is limited to TE$_{11}$; coupling to all other modes is accurately
described by the wall impedance representation.


VI. WALL IMPEDANCE OF ANISOTROPIC AND HETEROGENEOUS LININGS 


It has now been established that the wall impedance representation 
is a useful method in analyzing wave propagation in straight and curved 
sections of lined waveguide. To use this method for waveguides with 
anisotropic or heterogeneous linings we need to know the wall imped- 
ances of these linings. 


6.1 Anisotropic Lining 


Flock coating shows promise as a lining for circular electric wave- 
guide. Resistive fibers of the flock are parallel to the electric field of 
unwanted modes but perpendicular to circular electric fields. A flock coat 
is anisotropic, and in an $(x, y, z)$ system identified by


$$x = r - a, \qquad y = a\varphi, \qquad z = z \tag{64}$$


it may be described by the permittivity tensor 


$$\|\varepsilon\| = \begin{Vmatrix} \varepsilon_x & 0 & 0 \\ 0 & \varepsilon_y & 0 \\ 0 & 0 & \varepsilon_z \end{Vmatrix}. \tag{65}$$


Wall impedances of circular symmetric modes are used in the wall
impedance representation. In our present system of coordinates, circular
symmetry corresponds to $\partial/\partial y = 0$.

Assuming furthermore a z-dependence of $e^{-jhz}$, Maxwell's equations
may be written in the following form:

$$
\begin{aligned}
jhE_y &= -j\omega\mu H_x \\
-jhE_x - \frac{\partial E_z}{\partial x} &= -j\omega\mu H_y \\
\frac{\partial E_y}{\partial x} &= -j\omega\mu H_z \\
jhH_y &= j\omega\varepsilon_x E_x \\
-jhH_x - \frac{\partial H_z}{\partial x} &= j\omega\varepsilon_y E_y \\
\frac{\partial H_y}{\partial x} &= j\omega\varepsilon_z E_z .
\end{aligned} \tag{66}
$$

Eliminating the field components $E_x$, $E_y$ and $H_x$, $H_y$ from (66) we
get:

$$\frac{\partial^2 E_z}{\partial x^2} + (\omega^2\mu\varepsilon_x - h^2)\,\frac{\varepsilon_z}{\varepsilon_x}\,E_z = 0 \tag{67}$$

$$\frac{\partial^2 H_z}{\partial x^2} + (\omega^2\mu\varepsilon_y - h^2)\,H_z = 0. \tag{68}$$

The general solution of these equations has the form

$$A e^{j\chi x} + B e^{-j\chi x}$$

where for (67)

$$\chi = \chi_x\sqrt{\frac{\varepsilon_z}{\varepsilon_x}}, \qquad \chi_x = \sqrt{\omega^2\mu\varepsilon_x - h^2}$$

and for (68)

$$\chi = \chi_y = \sqrt{\omega^2\mu\varepsilon_y - h^2}$$

are the propagation constants in x- or radial direction.
A wave traveling in positive x-direction or outwardly in the cylindrical
system is represented by the second term. For such waves

$$\frac{\partial}{\partial x} = -j\chi$$

and we obtain from Maxwell's equations and (67)

$$j\omega\varepsilon_z E_z = -j\chi_x\sqrt{\frac{\varepsilon_z}{\varepsilon_x}}\,H_y, \qquad Z_z = \frac{\chi_x}{\omega\sqrt{\varepsilon_x\varepsilon_z}} \tag{69}$$

and from (68)

$$-j\omega\mu H_z = -j\chi_y E_y, \qquad Z_y = \frac{\omega\mu}{\chi_y}. \tag{70}$$

$Z_z$ and $Z_y$ are the wave impedances of an anisotropic medium. The wall
impedances of the anisotropic lining are input impedances of a radial
transmission line of length $(b - a)$ short-circuited at the end. In our
present approximation:

$$Z_z = j\,\frac{\chi_x}{\omega\sqrt{\varepsilon_x\varepsilon_z}}\,\tan\left[\chi_x\sqrt{\frac{\varepsilon_z}{\varepsilon_x}}\,(b - a)\right] \tag{71}$$

$$Z_y = j\,\frac{\omega\mu}{\chi_y}\,\tan\chi_y(b - a). \tag{72}$$

To make $Z_z$ and $Z_y$ constants of the waveguide, independent of a
particular mode, we consider only modes which are sufficiently far from
cutoff to propagate nearly as in free space. Then

$$\chi_x = \frac{2\pi}{\lambda}\sqrt{\frac{\varepsilon_x}{\varepsilon_0} - 1}, \qquad \chi_y = \frac{2\pi}{\lambda}\sqrt{\frac{\varepsilon_y}{\varepsilon_0} - 1}. \tag{73}$$

For circular electric waves only $Z_y$ enters the boundary condition.
Note that in (72) $Z_y$ is independent of $\varepsilon_x$. Resistive components
in the flock coat will cause a loss factor only of $\varepsilon_x$. Such resistive
components leave the circular electric wave loss unaffected.


6.2 Double Lining 


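A base layer of dissipative material and a top layer of low-loss
material provide mode filtering for TE$_{01}$ transmission and reduce
TE$_{01}$ loss in bends.

Let the base lining have a relative permittivity $\varepsilon$ and thickness
$b - a_1$, the top lining $\varepsilon_1$ and $a_1 - a$.

Input impedances of the base lining are

$$Z_z = j\sqrt{\frac{\mu_0}{\varepsilon_0}}\,\frac{\sqrt{\varepsilon - 1}}{\varepsilon}\,\tan\chi(b - a_1), \qquad Z_y = j\sqrt{\frac{\mu_0}{\varepsilon_0}}\,\frac{1}{\sqrt{\varepsilon - 1}}\,\tan\chi(b - a_1)$$

where

$$\chi = k\sqrt{\varepsilon - 1}.$$

These input impedances are transformed by the second lining according
to ordinary transmission line theory

$$Z_z = j\sqrt{\frac{\mu_0}{\varepsilon_0}}\;\frac{\sqrt{\varepsilon_1 - 1}}{\varepsilon_1}\;\frac{\varepsilon_1\sqrt{\varepsilon - 1}\,\tan\varphi + \varepsilon\sqrt{\varepsilon_1 - 1}\,\tan\varphi_1}{\varepsilon\sqrt{\varepsilon_1 - 1} - \varepsilon_1\sqrt{\varepsilon - 1}\,\tan\varphi\,\tan\varphi_1} \tag{74}$$

$$Z_y = j\sqrt{\frac{\mu_0}{\varepsilon_0}}\;\frac{1}{\sqrt{\varepsilon_1 - 1}}\;\frac{\sqrt{\varepsilon_1 - 1}\,\tan\varphi + \sqrt{\varepsilon - 1}\,\tan\varphi_1}{\sqrt{\varepsilon - 1} - \sqrt{\varepsilon_1 - 1}\,\tan\varphi\,\tan\varphi_1} \tag{75}$$

where

$$\varphi = 2\pi\sqrt{\varepsilon - 1}\;\frac{b - a_1}{\lambda}$$

and

$$\varphi_1 = 2\pi\sqrt{\varepsilon_1 - 1}\;\frac{a_1 - a}{\lambda}.$$

For a thin base layer $\varphi \ll 1$ and

$$Z_y = j\sqrt{\frac{\mu_0}{\varepsilon_0}}\;\frac{\dfrac{2\pi}{\lambda}(b - a_1) + \dfrac{\tan\varphi_1}{\sqrt{\varepsilon_1 - 1}}}{1 - \dfrac{2\pi}{\lambda}(b - a_1)\sqrt{\varepsilon_1 - 1}\,\tan\varphi_1}. \tag{76}$$

Note that in (76) $Z_y$ is independent of the permittivity in the base
layer. Any loss in the base layer will not significantly raise circular
electric wave loss.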
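
The insensitivity of the circular electric wall impedance to the base layer can be checked numerically. The following is a minimal sketch, not the author's computation. It assumes modes far from cutoff, treats each layer as a transmission-line section with radial phase constant k times the square root of (relative permittivity minus one), short-circuits the base layer at the metal wall, and normalizes impedances to the free-space wave impedance. The function name, argument names and sample dimensions are illustrative assumptions.

```python
import cmath
import math


def wall_impedance_te(eps_base, eps_top, t_base, t_top, wavelength):
    """Normalized wall impedance seen by circular electric waves.

    eps_base, eps_top: relative permittivities (may be complex for lossy layers).
    t_base, t_top:     layer thicknesses, same length unit as wavelength.
    """
    k = 2 * math.pi / wavelength
    root_base = cmath.sqrt(eps_base - 1)
    root_top = cmath.sqrt(eps_top - 1)
    z_base = 1 / root_base            # normalized wave impedance, base layer
    z_top = 1 / root_top              # normalized wave impedance, top layer
    # Base layer: transmission line short-circuited by the metal wall.
    load = 1j * z_base * cmath.tan(k * root_base * t_base)
    # Top layer: ordinary transmission-line transformation of that load.
    tan_top = cmath.tan(k * root_top * t_top)
    return z_top * (load + 1j * z_top * tan_top) / (z_top + 1j * load * tan_top)


# Thin base layer (1e-4 wavelengths) under a thicker low-loss top layer.
lossless = wall_impedance_te(4.0, 2.5, 1e-4, 5e-3, 1.0)
lossy = wall_impedance_te(10 - 5j, 2.5, 1e-4, 5e-3, 1.0)
relative_change = abs(lossy - lossless) / abs(lossless)
```

With these sample values the relative change between the lossless and the strongly lossy base layer is far below one part in a thousand, which is the behavior the thin-layer approximation predicts.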


VII. CONCLUSION 


In previous first-order approximations, normal modes of lined wave- 
guide were considered perturbed modes of plain waveguide, and coeffi- 
cients of curvature couplings were assumed the same as in metallic 
waveguide. In some respects these approximations hold only for ex- 
tremely thin linings, thinner for example than 0.05 per cent of the 
waveguide radius. The present more exact analysis shows that the
TE$_{11}$ wave has a phase constant much higher than would be expected
from these approximations. Neither the thickness nor the permittivity
of the lining can increase the phase difference between TM$_{11}$ and
TE$_{01}$ beyond a certain limit. The phase difference eventually is almost
independent of $\delta$ and $\varepsilon$ and is small for high frequency.

Curvature coupling between TE$_{01}$ and TE$_{11}$ is substantially smaller
in lined waveguide than in plain waveguide, while it is nearly independent
of the lining between TE$_{12}$ and TE$_{01}$. Between TM$_{11}$ and TE$_{01}$,
however, it first increases and then stays constant.

Waveguides with sandwiched or anisotropic linings may be analyzed 
by using a wall impedance representation. Wall impedances which the 
lining presents to fields of circular symmetry may be used in this analy- 
sis. They may easily be calculated for flock coatings and double linings. 
The wall impedance representation is found to be accurate for all typical
cases.


Some Traffic Characteristics of
Communications Networks with 
Automatic Alternate Routing 


By J. H. WEBER 
(Manuscript received August 238, 1961) 


As a first step in the investigation of communications networks with 
automatic alternate routing, a simulator has been prepared using the IBM 
7090 high-speed digital computer. The simulator is capable of being applied 
to a large class of networks, the principal restrictions being that blocked 
calls are cleared, and no congestion or delay is encountered at the switching 
points. Although the first version of the simulation program requires that
the alternate routing plan be fixed in advance (i.e., before a run), the program
design is such that traffic-dependent alternate routing doctrines can easily be
provided.

The simulator has so far been used to examine the behavior of small 
networks of various sizes, configurations, and alternate routing doctrines 
under normal and abnormal conditions of load. Several criteria are introduced
and used to evaluate the relative performance of different networks,
leading to conclusions regarding the merits of certain alternate routing
procedures and the areas of profitable application of the networks studied.
The overload capabilities of these networks are of particular interest and are 
examined in some detail. 


I. INTRODUCTION 


The recent rapid expansion of long-distance communications facilities 
to serve increasing civilian and military demands, along with the evolu- 
tion of cheaper trunking facilities and more sophisticated switching 
techniques, continues to bring the problem of network design and engi- 
neering to the attention of communications engineers. Although methods 
have been developed for engineering certain types of networks for the 
most economical distribution of trunking facilities, several critical problems
remain.

One of these is the lack of understanding of the behavior of alternate
routing networks under overload conditions, whether the overload is 
local or system wide. Local disasters, such as storms, earthquakes, etc., 
have caused severe deterioration of service in certain regions owing to 
increased loads directed toward the affected area. At other times, such 
as Christmas Day in the United States, the pattern of traffic shifts 
radically, again causing serious overloads and long delays in completing 
calls. Finally, some concern is felt for the behavior of the system under 
the impact of some widespread disaster, where overloads may appear 
everywhere simultaneously. 

Such considerations lead in turn to two questions. First, how shall 
networks be designed to be efficient during normal operation and yet 
not deteriorate catastrophically under overloads, and second, given a 
network design, can the switching pattern be altered for the duration of 
an overload to improve performance, and if so, how? 

Another problem is our present inability to engineer any but the 
limited class of alternate routing networks of a “hierarchical” nature
which have been widely used in the Bell System and elsewhere. 

Since no analytic techniques appeared to be available or soon forth- 
coming to answer these questions, a simulation study was undertaken in 
the hope that some insights might be provided into the operation of such 
networks which would be helpful in their design and in the development 
of theoretical models to predict their characteristics. 

Accordingly, a program was written for the IBM 7090 computer 
which enables various alternate routing philosophies to be simulated and 
compared. In line with the general nature of the problem being studied, 
the program was designed to accept a large variety of networks and be 
easily expandable to encompass more sophisticated alternate routing 
procedures as they evolve. 

The capabilities and limitations of the simulator are outlined in some 
detail in the next section, followed by a description of the first experi- 
ment using the program. Finally the results are presented and analyzed, 
and some general characteristics of alternate routing networks of the 
types studied are set forth. 


II. SIMULATOR CHARACTERISTICS 


Although many of the problems which arise when alternate routing 
networks are overloaded are caused by switching delays and shortages, 
it was decided, as a first step, to consider only the effects of trunking, 
since the switching problems are unique to particular systems, and would
in any event considerably complicate both the program and interpretation
of the results. Accordingly, the program was constructed under the
following restrictions: 

(1) No blocking or delay is introduced by any switching point. 

(2) Calls which do not receive service immediately are cleared from 
the system and do not return. (If setup time is assumed negligible, 
and there are no delays, the lack of retrials is not likely to mate- 
rially affect the nature of the results.) 

If the network is considered to consist of nodes (corresponding to 
switching centers) and links connecting them (corresponding to direct 
trunk groups), then each link may be assigned an originating traffic, a 
trunk group size and an alternate routing pattern. In addition, calls 
which overflow the direct route and are to be alternate routed may be 
assigned a directionality, or originating node, which allows one of two 
alternate routing configurations to be hunted over, depending upon the 
direction of the call. Every trunk group is a “two way” group, so no 
direction need be assigned to calls which are carried on the direct route. 

The simulator will accept systems which contain as many as 63 
switching points, each of which may be connected to any other switching 
point by up to 511 trunks. Calls which do not find an available trunk in 
the direct route may overflow through one of two sets of up to 63 specified 
routes, depending upon the direction of the call. Each alternate route 
may contain as many as 7 links, which implies switching through up to 
6 intermediate nodes. (A modification of the program allows the alternate
routes to be chosen on a “step by step” basis, where the first node in
the alternate route chain is specified, and the call proceeds from node
to node according to the alternate routing specification at the last node
through which the call was switched. “Ring-around-the-Rosy” and
“Shuttling” are prevented by keeping records of where the call has already
been switched and not allowing it to use the same node twice.) It
should be emphasized that the program as described here requires that 
the entire alternate route be specified at the originating link, and failure 
to connect on any link of an alternate route allows an entirely new route 
to be selected. 
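
The hunting rule described above can be sketched as follows. This is an illustration under assumptions, not the 7090 program: a link is represented as a pair of node numbers, `free` maps each link to its idle trunks, `alternates` maps a (link, direction) pair to an ordered list of routes, and all of these names are hypothetical. An alternate route is seized only if every one of its links has an idle trunk; otherwise the next route is tried, and a call that finds no route is cleared.

```python
def place_call(link, direction, free, alternates):
    """Return the list of links seized for the call, or None if it is cleared.

    link:       direct link for the call, e.g. (1, 2)
    direction:  0 or 1, selecting one of the two alternate routing patterns
    free:       dict mapping link -> number of idle trunks (updated in place)
    alternates: dict mapping (link, direction) -> list of routes,
                each route being a list of links
    """
    # First choice: an idle trunk in the direct group.
    if free[link] > 0:
        free[link] -= 1
        return [link]
    # Overflow: hunt over the specified alternate routes in order.
    for route in alternates.get((link, direction), []):
        if all(free[hop] > 0 for hop in route):
            for hop in route:
                free[hop] -= 1
            return list(route)
    # Blocked calls are cleared and do not return.
    return None


free = {(1, 2): 1, (1, 3): 1, (2, 3): 1}
alternates = {((1, 2), 0): [[(1, 3), (2, 3)]]}
first = place_call((1, 2), 0, free, alternates)    # uses the direct trunk
second = place_call((1, 2), 0, free, alternates)   # overflows via node 3
third = place_call((1, 2), 0, free, alternates)    # finds nothing idle
```

Releasing trunks at the end of a holding time, which a full simulation would also need, is omitted here to keep the sketch short.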

An over-all maximum size of the system, set by the limitations of the 
computer memory, is 


(13 × Number of links) + Total number of trunks
+ Total number of alternate routes = 22,013.


For example, if a system has 40 nodes (and therefore 780 links), and if
there is a total of 3900 trunks in the system (corresponding to an average
of five trunks per link), then 7973 alternate routes, or about 10 per link,
can be specified. This is a rather large system, and in fact the simulator 
allows for experimentation with a wide range of possible trunk, node, 
and alternate route configurations. 
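
The arithmetic of this memory limit is easy to reproduce. The short sketch below assumes a fully connected network (one link per pair of nodes), as in the 40-node example; the function and constant names are illustrative only.

```python
MEMORY_LIMIT = 22_013   # over-all size limit quoted in the text
WORDS_PER_LINK = 13     # multiplier applied to the number of links


def links_in_full_mesh(nodes):
    """Number of links when every pair of nodes is directly connected."""
    return nodes * (nodes - 1) // 2


def alternate_routes_available(nodes, trunks):
    """Alternate routes that still fit after links and trunks are counted."""
    links = links_in_full_mesh(nodes)
    return MEMORY_LIMIT - WORDS_PER_LINK * links - trunks


links = links_in_full_mesh(40)                      # 780 links
routes = alternate_routes_available(40, 5 * links)  # five trunks per link
```

For 40 nodes and five trunks per link this leaves 7973 alternate routes, or a little over 10 per link, in agreement with the example.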

In order to estimate the reliability of the simulation results, outputs 
may be printed out at subintervals of the run. Furthermore, since the 
system starts empty, equilibrium may be attained before records are 
kept by running the program for a number of subintervals and discarding 
their records. The number of subintervals to be printed out, as well as 
the number to be discarded, can be specified in the input, along with the 
average holding time per call, the total time the simulation should be 
run in holding times, and indications as to what sort of alternate routing 
scheme is to be used. The holding times of all calls are assumed to be 
exponentially distributed with identical means, and traffic levels are 
varied by altering the average time between calls offered to each link. 
Pseudo-random numbers to specify the input are generated by a multi- 
plicative congruential technique which gives a cycle of 2° numbers 
before a repeat. The random number generator is not cleared after every 
simulation, so that if several experiments are made successively, they 
will not utilize the same sequence of random numbers. Thus runs can be 
repeated identically if desired, or, alternatively, a different set of random 
numbers can be used for the same system configuration by the simple 
expedient of reordering the experiments. 
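
A multiplicative congruential generator feeding exponentially distributed holding times can be sketched as below. The modulus and multiplier are assumptions chosen for illustration and are not taken from the paper, as are the function names. Because the generator state is carried from one draw to the next, restarting from the same seed repeats a run exactly, while continuing without a reset gives a fresh sequence.

```python
import math

MODULUS = 2 ** 35      # assumed modulus, not taken from the paper
MULTIPLIER = 5 ** 13   # assumed multiplier, not taken from the paper


def make_generator(seed=1):
    """Return a function that yields uniform numbers strictly between 0 and 1."""
    if seed % 2 == 0:
        raise ValueError("seed must be odd so the state can never become zero")
    state = seed % MODULUS

    def uniform():
        nonlocal state
        state = (MULTIPLIER * state) % MODULUS   # multiplicative congruential step
        return state / MODULUS

    return uniform


def exponential(uniform, mean):
    """Exponentially distributed time with the given mean (inverse transform)."""
    return -mean * math.log(uniform())


run_a = make_generator(12345)
run_b = make_generator(12345)
holding_a = [exponential(run_a, 1.0) for _ in range(5)]
holding_b = [exponential(run_b, 1.0) for _ in range(5)]   # identical to holding_a
holding_next = [exponential(run_a, 1.0) for _ in range(5)]  # generator not reset
```

The same `exponential` helper could supply the times between calls offered to a link, with the mean adjusted to set the traffic level.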

The information which is printed out, in addition to that derived
directly from the inputs (number of nodes, number of trunks and loads
per link, alternate routing patterns, etc.), is as follows:

(1) An estimate of the probability of loss (blocking) from each link; i.e.,
the proportion of calls directly offered to a specific link which are
unable to be served at all.

(2) An estimate of the probability of direct overflow; i.e., the propor- 
tion of calls which overflow the direct route, although they may 
be served on an alternate route. 

(3) Number of calls offered to each link, both directly and as an 
alternate route. 

(4) Load in erlangs carried by each link, both from direct and alternate 
routed traffic. 

(5) Calls carried by each link, both from direct and alternate routed 
traffic. 


(6) Over-all average blocking; i.e., $\sum_i a_i p_i / \sum_i a_i$, where $a_i$ is
the load offered to link $i$, and $p_i$ is the blocking experienced by $a_i$.

(7) Total carried load, which is really total calls multiplied by holding
times. That is, a call is assumed to provide the number of erlangs
its holding time would represent, regardless of how many links
are used. This quantity is derived indirectly, by multiplying the
total offered load by one minus the overall average blocking.
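
The last two quantities in this list follow directly from the per-link results. The sketch below shows the calculation on made-up numbers; the function names and the sample loads and blocking values are illustrative and not taken from any run.

```python
def overall_blocking(offered, blocking):
    """Load-weighted average blocking: sum(a_i * p_i) / sum(a_i)."""
    total_offered = sum(offered)
    lost = sum(a * p for a, p in zip(offered, blocking))
    return lost / total_offered


def total_carried_load(offered, blocking):
    """Total offered load times one minus the overall average blocking."""
    return sum(offered) * (1 - overall_blocking(offered, blocking))


offered = [10.0, 30.0]   # erlangs offered to each link (hypothetical)
blocking = [0.1, 0.3]    # loss probability on each link (hypothetical)
average = overall_blocking(offered, blocking)   # (1 + 9) / 40
carried = total_carried_load(offered, blocking)
```

As a consistency check on the definition, an overall blocking of 0.335161 applied to a total offered load of 481.25 erlangs gives a carried load of about 319.95 erlangs.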


For moderate sized systems this program will process calls at the rate
of about 1,200,000 calls per hour. A sample output is shown in Fig. 1.


[Computer printout, not reproduced: NETWORK SIMULATION results. Header
fields: number of nodes, average holding time, alternate routing plan,
number of hold times, initializing subgroups. Columns for each link: link
number, offered load, number of trunks, loss probability, direct overflow,
calls offered (direct, alternate, total), carried load (direct, alternate,
total), calls carried (direct, alternate, total). Summary lines: OVERALL
AVERAGE BLOCKING = 0.335161; TOTAL CARRIED LOAD = 319.95. The printout
closes with the alternate route pattern for each link and the per cent of
traffic in the first direction.]

Fig. 1 — Sample computer output.

Clearly, the flexible nodal structure upon which the program is based
allows certain things to be done by expending extra nodes which are not 
directly programmed. For example, if it is desired to have two one-way 
groups in a link, it is necessary merely to assign no trunks to the direct 
link, and have each of the alternate route patterns contain one node, 
which is assigned for the purpose. So, a single link would then have 4 
nodes and 4 links with trunks assigned, as shown below. 


[Sketch: a single link rebuilt as two one-way groups. Nodes 1 and 2 are
original nodes; nodes 3 and 4 are supplemental nodes.]





Another possible use is the simulation of progressive graded multiples. 
Suppose it is desired to simulate the simple multiple shown below. 


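[Sketch: simple progressive graded multiple with offered loads $a_1$ and $a_2$.]


This is clearly equivalent in terms of loads carried and blocking to the
following nodal structure.


[Sketch: equivalent nodal structure with nodes 1 to 4.]


In this analogue, $a_1$ has links 1-4 and 3-4 as an alternate route, while $a_2$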
overflows through 2-4 and 3-4. Thus, if links 1-4 and 2-4 are provided 
with more trunks than 3-4, they can introduce no blocking and the 
system corresponds to the graded multiple above where link 3-4 is 
equivalent to the common group. This sort of flexible structure, then, 
appears to be useful in many ways, and may in fact come to have applica- 
tion beyond its original intent. 

This program was primarily designed as a tool for the evaluation of 
alternate routing networks and as an aid in formulating principles for 
their design and administration, although one of the purposes was to
assist in the solution of real problems as they arose. Accordingly, studies
have been begun with the aim of codifying classes of networks and deter- 
mining the significant parameters, advantages and disadvantages of each. 


III. ANALYSIS OF NETWORKS 


As a first step in the study of alternate routing communications net- 
works it is desired to compare the behavior of several alternative con- 
figurations under normal and overload conditions. The variables which 
seem most likely to be significant in determining network performance 
are: 

(1) Number of switching points in the network. 

(2) Overall size of the network, perhaps best described as a measure
of the average load or number of trunks per link.

(3) Alternate routing procedure used. Thus, a system which allows all 
traffic to overflow in some specified manner will probably perform 
differently than one which considers some routes to be “high 
usage” and from which traffic is alternate routed, and others to be 
“finals,” from which no alternate routing is permitted. 

(4) Type of overload encountered. A uniform (system wide) overload, 
for example, may cause a behavior quite different from an over- 
load which is confined to a particular portion of the network. 

In order to estimate the performance of networks when these parameters 
vary, and yet keep the results simple enough so that they can be easily 
interpreted, eight different networks were studied, each having two 
different alternate routing plans. In each case both uniform and non- 
uniform overloads were considered. 

The eight networks studied consisted of two networks with three, 
two with four, two with five, and two with six nodes. The loads were ad- 
justed so that the average load per link varied from three and one half 
erlangs per link in the most lightly loaded network to about 28 erlangs 
per link in the most heavily loaded configuration. 

For purposes of convenience, the following terminology will subse- 
quently be used: 

(1) A link is a connection between two nodes, which may have any
number of trunks, including zero.

(2) A node is a switching center, characterized by two or more links 
terminating at it. 

(3) If network A is larger than network B, it has more nodes. 

(4) If network A is heavier than network B, it has more offered erlangs
per link on the average.

(5) A hierarchical alternate route network is one in which at least
some of the trunk groups are high usage; i.e., traffic which is not
carried can be overflowed to other groups, at least some of which
are finals, which have no alternate routes.

(6) A symmetrical alternate route network has only high usage groups.

(7) A simple network has only final routes; i.e., no alternate routing
is allowed.

The procedure which was followed in all cases was to postulate loads 
offered to each link in the network. For the sake of generality, these 
loads are ordinarily unequal, although in some cases equal loads are 
used in places where it is thought that this will not prejudice the results. 
Each network was engineered for a loss of 1 per cent on the worst link
for simple, symmetrical and hierarchical networks. Loads were then
applied corresponding to (a) 25 per cent, 50 per cent, 75 per cent, and 
100 per cent uniform overload and (b) 25 per cent, 50 per cent, 75 per 
cent, and 100 per cent overload on all routes terminating at node 1. 
The selection of node 1, of course, is quite arbitrary, but this choice 
appears to be immaterial in the symmetrical case, and is likely to be 
most typical for hierarchical networks. (The heaviest loads in the 
hierarchical networks were reserved for the final routes, since this will 
allow most effective use of the hierarchy and seems to correspond to 
actual practice.) The simulations ran for 200 holding times for heavy 
networks and 1000 holding times for light networks, with an additional 
20 per cent of this time (i.e., 40 or 200 holding times) discarded at the 
beginning of each run to remove the initial transient. Results were 
printed out at five subintervals of the total run, and examined to deter- 
mine that the initial transient had been removed and the run was long 
enough to yield results sufficiently accurate for the purposes of this study. 
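
The two kinds of overload applied here can be written out explicitly. The sketch below is illustrative only: links are keyed by their pair of node numbers, the function names are hypothetical, and the sample loads are the engineered loads of the light three-node network (2, 4 and 6 erlangs on links 1-2, 1-3 and 2-3).

```python
def uniform_overload(loads, percent):
    """Raise the offered load on every link by the same percentage."""
    factor = 1 + percent / 100
    return {link: load * factor for link, load in loads.items()}


def node_overload(loads, node, percent):
    """Raise the offered load only on links terminating at the given node."""
    factor = 1 + percent / 100
    return {
        link: load * factor if node in link else load
        for link, load in loads.items()
    }


three_node_light = {(1, 2): 2, (1, 3): 4, (2, 3): 6}   # erlangs per link
uniform_50 = uniform_overload(three_node_light, 50)
node1_100 = node_overload(three_node_light, 1, 100)
```

A 50 per cent uniform overload raises the three loads to 3, 6 and 9 erlangs, while a 100 per cent overload at node 1 doubles the loads on links 1-2 and 1-3 and leaves link 2-3 at 6 erlangs.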

Sketches of the networks are shown in Figs. 2 to 9 along with tables 
indicating the link loads and trunks assigned for each of the alternate 
routing doctrines. (Two sketches of each network are provided, one of 
which can be easily related to a symmetrical alternate routing philosophy 
while the other suggests a hierarchical doctrine. The dashed lines in the 
hierarchical sketch denote high usage groups, while the solid lines repre- 
sent final routes. Since all links are high usage in the symmetrical system 
this distinction is not needed, and identical solid lines were used
throughout.) The simple networks were engineered using the Erlang B tables,
while the hierarchical networks were engineered using conventional
methods with the sort of hierarchy used in the Bell System, allowing
about 0.7 erlangs (25 ccs) on the last trunk in a high usage group. The
(hierarchical) configurations were then checked experimentally using the
simulator, and adjustments were made where required. The symmetrical
networks were designed to allow each parcel of traffic to overflow through
all other nodes in turn and were engineered entirely with the simulator by
trial and error. A first estimate of trunk quantities was made using a
fixed differential between the load in erlangs and the number of trunks
in each link, and corrections were then made as required.


Link        Engineered Load        Engineered Trunks
Number         (erlangs)     Simple  Symmetrical  Hierarchical
1-2                2            7         6            1
1-3                4           10         8           13
2-3                6           13        11           16
Total             12           30        25           30
Ave/Link           4           10      8.35           10

[Sketches: hierarchical and symmetrical three-node networks, nodes 1 to 3.]

Fig. 2 — Three-node network models with table of link loads and trunk
assignments for light loading.


Link        Engineered Load        Engineered Trunks
Number         (erlangs)     Simple  Symmetrical  Hierarchical
1-2                5           11        11            4
1-3               10           18        16           21
2-3               15           24        21           27
Total             30           53        48           52
Ave/Link          10        17.65        16        17.35

[Sketches: hierarchical and symmetrical three-node networks, nodes 1 to 3.]

Fig. 3 — Three-node network models with table of link loads and trunk
assignments for heavy loading.


Link        Engineered Load        Engineered Trunks
Number         (erlangs)     Simple  Symmetrical  Hierarchical
1-2                1            5         4            0
1-3                4           10         8           13
1-4                2            7         5            2
2-3                3            8         6            3
2-4                5           11         9           15
3-4                6           13        10           17
Total             21           54        42           50
Ave/Link         3.5            9         7         8.33

[Sketches: hierarchical and symmetrical four-node networks, nodes 1 to 4.]

Fig. 4 — Four-node network models with table of link loads and trunk
assignments for light loading.


Link        Engineered Load        Engineered Trunks
Number         (erlangs)     Simple  Symmetrical  Hierarchical
1-2                5           11        12            3
1-3               20           30        27           37
1-4               10           18        18            9
2-3               15           24        22           14
2-4               25           36        32           43
3-4               30           42        38           52
Total            105          161       149          158
Ave/Link        17.5         26.8      24.8        26.35

[Sketches: hierarchical and symmetrical four-node networks, nodes 1 to 4.]

Fig. 5 — Four-node network models with table of link loads and trunk
assignments for heavy loading.


NETWORK PARAMETERS, 5 NODES — LIGHT

Link        Engineered Load        Engineered Trunks
Number         (erlangs)     Simple  Symmetrical  Hierarchical
1-2                1            5         4            0
1-3                5           11         9           17
1-4                2            7         5            1
1-5                2            7         5            1
2-3                3            8         6            2
2-4                4           10         7            3
2-5                6           13        10           19
3-4                6           13        10           21
3-5                4           10         7            4
4-5                7           14        11           23
Total             40           98        74           91
Ave/Link           4          9.8       7.4          9.1

[Sketches: hierarchical and symmetrical five-node networks, nodes 1 to 5.]

Fig. 6 — Five-node network models with table of link loads and trunk
assignments for light loading.
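
The engineering of the simple networks can be reproduced from the Erlang B loss formula, which is what such tables tabulate. The sketch below is an illustration, not the procedure actually followed: it uses the standard recursion for the Erlang B blocking probability and searches for the smallest trunk group that meets a 1 per cent loss. The function names are hypothetical.

```python
def erlang_b(load, trunks):
    """Erlang B blocking probability for a load in erlangs offered to a group.

    Uses the recursion B(0) = 1, B(n) = a*B(n-1) / (n + a*B(n-1)).
    """
    blocking = 1.0
    for n in range(1, trunks + 1):
        blocking = load * blocking / (n + load * blocking)
    return blocking


def trunks_for_loss(load, max_loss=0.01):
    """Smallest number of trunks whose Erlang B blocking does not exceed max_loss."""
    trunks = 0
    while erlang_b(load, trunks) > max_loss:
        trunks += 1
    return trunks


# Engineered loads of the light three-node network: 2, 4 and 6 erlangs.
simple_trunks = [trunks_for_loss(load) for load in (2, 4, 6)]
```

For loads of 2, 4 and 6 erlangs this gives 7, 10 and 13 trunks, the values listed for the simple three-node network under light loading.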

Having established this framework, or procedure for evaluation, a
critical question is, What criteria can be used to compare the performance
of various types of networks? It is desirable to take account of the
efficiency (carried load per dollar of investment) of the network at all
times, as well as the grades of service which are provided to each group
of customers. Although grade of service here can no longer be interpreted
as the small percentage of blocked calls that is ordinarily encountered at
normal engineered loads, it is nevertheless incumbent upon the network
designer to consider the extent to which service is degraded on any
particular link. Similarly, one would expect the efficiency under overload
conditions to be higher than that encountered during normal operation, but
the relative efficiencies of networks using various alternate routing
doctrines (to carry the same loads) may be rather different. It is clear, for
example, that if a call uses a trunk in each of two links, there is a
possibility of lower network efficiency being obtained than if it used a trunk
in only one link. It is one of the purposes of this study to determine at
what overload point such a loss in efficiency takes place, and what if any
remedial action can be taken.


NETWORK PARAMETERS, 5 NODES — HEAVY

Link        Engineered Load        Engineered Trunks
Number         (erlangs)     Simple  Symmetrical  Hierarchical
1-2                5           11        13            3
1-3               35           47        43           57
1-4               15           24        23           15
1-5               10           18        18            9
2-3               20           30        28           19
2-4               25           36        33           25
2-5               40           53        48           64
3-4               45           58        53           75
3-5               30           42        38           31
4-5               50           64        58           81
Total            275          383       355          379
Ave/Link        27.5         38.3      35.5         37.9

[Sketches: hierarchical and symmetrical five-node networks, nodes 1 to 5.]

Fig. 7 — Five-node network models with table of link loads and trunk
assignments for heavy loading.

Thus, two rather different criteria appear to be important, one of
which is essentially an economic variable, and the other a service
variable. They are both further complicated by the fact that the first
depends on the relative costs of trunks in different links, and the second
has a different value for every parcel of traffic in the network. In order
to simplify these complexities and reduce the number of variables which
enter into the measure of performance, only the worst blocking in the
network will be considered as the service criterion. This is, of course,
conservative, and reflects the difficulties which might occur when
alternate routing is canceled and a small parcel of traffic has no trunks in the
direct path. The blocking on such a parcel would then be unity, and it
would quickly be noticed that a parcel of traffic is isolated.


NETWORK PARAMETERS, 6 NODES — LIGHT

Link        Engineered Load        Engineered Trunks
Number         (erlangs)     Simple  Symmetrical  Hierarchical
1-2                1            5         4            0
1-3                5           11         8           16
1-4                1            5         4            1
1-5                3            8         7            3
1-6                2            7         5            2
2-3                2            7         5            2
2-4                6           13         9           18
2-5                3            8         6            3
2-6                4           10         7            4
3-4                4           10         7            4
3-5                6           13         9           21
3-6                4           10         7            4
4-5                5           11         8            5
4-6                7           14        11           23
5-6                7           14        11           24
Total             60          146       108          130
Ave/Link           4         9.75       7.2         8.67

[Sketches: hierarchical and symmetrical six-node networks, nodes 1 to 6.]

Fig. 8 — Six-node network models with table of link loads and trunk
assignments for light loading.

The problem of assigning costs to trunks is more difficult, of course,
since there is no apparent logical worst or best case. Thus the
assumptions made here for the relative costs are quite arbitrary and
oversimplified, but may still be useful in evaluating network performance. Two
different assumptions will be made. The first is that all trunks have the
same cost. This might be a reasonably realistic assumption in a network
where the designers are likely to think of symmetrical alternate routing
doctrines. In effect, it states that the distance between any two nodes is
not sufficiently different from the distance between any other two nodes
to materially affect the cost of trunking facilities between them.
Although this may appear to represent an unrealistic geographical
situation, it may not be too far in error if the economics of long haul, large
cross section trunking facilities are considered. In such systems, the
terminal costs make up a large portion of the total trunk cost, and these,
of course, are independent of the length of haul. The second assumption
is that some routes are half as expensive as the others. For example, in
Fig. 4, routes 1-3 and 2-4 are each considered to be half as long as each
of the other routes in the network, all of which are assumed to be
virtually the same length (all lengths here, of course, refer to costs, which
are ordinarily roughly proportional to lengths). This assumption is
geographically reasonable, and is, in fact, the kind of layout that is often
encountered and which may well have prompted the development of
hierarchical alternate routing procedures. Although neither of these
weighting schemes may exactly represent an actual case, using each
assumption in its logical place may yield more realistic comparative results
than would using the same assumption throughout.


Link        Engineered Load        Engineered Trunks
Number         (erlangs)     Simple  Symmetrical  Hierarchical
1-2                5           11        15            3
1-3               40           53        48           67
1-4               10           18        19            9
1-5               20           30        28           22
1-6               15           24        24           13
2-3               10           18        19            9
2-4               40           53        48           67
2-5               20           30        28           18
2-6               25           36        33           27
3-4               30           42        38           27
3-5               45           58        53           82
3-6               30           42        38           30
4-5               35           47        43           35
4-6               50           64        58           90
5-6               50           64        58           97
Total            425          590       550          596
Ave/Link        28.3         39.4      36.7         39.7

[Sketches: hierarchical and symmetrical six-node networks, nodes 1 to 6.]

Fig. 9 — Six-node network models with table of link loads and trunk
assignments for heavy loading.

Once the parameters for evaluation have been reduced to two (worst
blocking and load carried per dollar of investment), they can be
combined into one by the following argument. Both of these parameters,
which will be called B (blocking) and E (efficiency), generally increase
with increasing loads (although E may occasionally decrease in a
non-simple network). A large value of E is generally desirable, but, of
course, a large value of B is not. In fact, quite the reverse is true,
and so a high value of (1 - B) is a desirable goal. Furthermore, the two
factors will increase under different circumstances. For example, a
highly efficient network may readily yield a very high value of E, but
will also cause very high blocking. Thus B will be high and (1 - B) low.
Conversely, a loosely engineered network is likely to provide good
service under overloads, yielding a relatively low B and high (1 - B),
but in turn be inefficient, with a low E. In both of these cases, the
product E(1 - B) will take on some intermediate value. Accordingly, a
figure of merit for networks, called the Performance Measure, will be
defined as M = E(1 - B). This number may be dimensioned to lie between
zero and one and will pass through a maximum as the load is increased.
At engineered levels it will essentially represent the network
efficiency, and as the load is increased it will indicate when service
or efficiency or both are deteriorating. A high M is clearly a mark of a
well performing system, with efficient trunk usage and at least
tolerable service, while a low or rapidly decreasing M will show a
system which is either being inefficiently used or is providing poor
service or both. If M = 0 an intolerable situation exists; i.e., either
some parcel of traffic is unable to be served or no load is being
carried by the network. If a network can be designed to be efficient
under normal conditions and to maximize M during moderate or partial
overloads, and steps can be taken to prevent too rapid a degradation of
this quantity during severe overloads, then it is a reasonable
assumption that this design will be satisfactory for its purposes. That
is, it will provide efficient communications facilities at normal loads,
and will also allow the continuation of at least tolerable service
between all points under adverse conditions.
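
The definition of the performance measure reduces to a one-line
calculation. The following is a minimal Python sketch, assuming that E
has already been scaled to lie between zero and one and that B is the
blocking probability of the worst-served parcel of traffic. The function
name and the sample figures are illustrative assumptions and are not
taken from the study.

```python
def performance_measure(efficiency, worst_blocking):
    """Return M = E * (1 - B).

    efficiency     -- E, assumed already scaled to the range [0, 1]
    worst_blocking -- B, blocking probability of the worst-served parcel
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("E is assumed to be scaled to [0, 1]")
    if not 0.0 <= worst_blocking <= 1.0:
        raise ValueError("B is a probability and must lie in [0, 1]")
    return efficiency * (1.0 - worst_blocking)


# Hypothetical (E, B) pairs for one network as the load rises.
# These numbers are invented purely to show M passing through a maximum.
samples = [(0.60, 0.01), (0.70, 0.05), (0.78, 0.20), (0.82, 0.45)]
measures = [performance_measure(e, b) for e, b in samples]
best = max(range(len(measures)), key=measures.__getitem__)
print(measures, "maximum at sample", best)
```

In this invented series both E and B rise with load, and M peaks at an
intermediate load before falling, which is the qualitative behavior
described above.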

One more modification was made in the results before analysis was
undertaken. Since large trunk groups are more efficient than small ones,
the group size would naturally affect the values of E and M, both at
normal loads and under overload conditions. Accordingly, E and M were
then plotted for each network relative to the E and M of the
corresponding simple networks. The simple network was chosen as a
convenient reference point, since it is easily engineered and requires
no complicated switching equipment for implementation. Thus, one of the
strong changes in efficiency which is not caused by the alternate
routing pattern is largely removed, permitting comparisons among the
latter to be more readily made.
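
The normalization just described can be sketched in a few lines of
Python. The function name and the sample values are assumptions made
for illustration; they are not data from the study.

```python
def relative_to_simple(network_value, simple_value):
    """Express E or M of an alternate routing network as a multiple of
    the same quantity for the simple network carrying the same loads."""
    if simple_value <= 0.0:
        raise ValueError("reference value for the simple network must be positive")
    return network_value / simple_value


# Hypothetical absolute efficiencies at one load level (invented values).
e_network, e_simple = 0.72, 0.64
relative_e = relative_to_simple(e_network, e_simple)
print(relative_e)  # a value above 1 means the network outperforms the simple one
```

A relative value of unity therefore marks the point at which an
alternate routing network does no better than a simple network
engineered for the same loads.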


IV. RESULTS 


In order to investigate the effects of various parameters on network
behavior, the network efficiencies, E, and performance measures, M,
were calculated for engineered loads and for the various overload
conditions which were tested. The relative values of E and M (relative
to a simple network designed to carry the same loads) were then plotted
versus the degree and kind of overload experienced by the network. Thus
Fig. 11 shows the values of M for symmetrical networks, for loads
ranging from engineered to 100 per cent overload where the overloads
occur uniformly throughout the network. Fig. 13 exhibits the same
information for hierarchical networks. Figs. 12 and 14 show the behavior
of M with load when the overloads occur only on those links which
terminate at node 1 with all other loads remaining at engineered levels.
Finally, Figs. 15 through 18 are graphs of efficiency (E) versus load
for the same situations as pertain to Figs. 11 through 14. The points
from which the (smoothed) curves were plotted are shown in Figs. 12 to
18. They are omitted in Fig. 11 for the sake of clarity.

In order to keep the comparisons between symmetrical and hierarchical
networks on a somewhat realistic basis, it is necessary to make some
adjustment for the probable differences in geography which are likely
to encourage consideration of one or the other type of network.
Accordingly, as was mentioned above, certain trunks were considered to
cost twice as much as others, which introduced a weighting factor into
the values of E and hence into M as well. For example, in the four-node
case shown in Fig. 4, trunks in links 1-3 and 2-4 were considered to be
only half as expensive as trunks in the remaining four links in the
network. This reduced the cost of the trunk plant to 0.805 times the
value which would result if all trunks were assumed to be of equal value
in the case of a simple network, and to 0.720 times its former value for
the hierarchical case. Thus the relative efficiency of the hierarchical
network is increased by a factor of 0.805/0.720 or 1.119. This sort of
adjustment was made in all calculations relating to hierarchical
networks, while all trunks in symmetrical networks were assumed to be of
equal value. The weighting factors obtained for the various networks are
tabulated in Fig. 10.
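
The cost adjustment can be illustrated with a short Python sketch. The
half-cost links 1-3 and 2-4 follow the four-node example in the text;
the trunk counts, however, are hypothetical, since the trunk
assignments for that network are not reproduced here, and the function
names are likewise assumptions.

```python
HALF_COST_LINKS = {"1-3", "2-4"}  # per the four-node example in the text


def cost_ratio(trunks):
    """Weighted trunk-plant cost divided by the cost with all trunks equal."""
    equal_cost = sum(trunks.values())
    weighted = sum(n * (0.5 if link in HALF_COST_LINKS else 1.0)
                   for link, n in trunks.items())
    return weighted / equal_cost


def weighting_factor(simple, hierarchical):
    """Factor by which the hierarchical network's relative efficiency rises."""
    return cost_ratio(simple) / cost_ratio(hierarchical)


# Hypothetical trunk counts, invented for illustration only.
simple_trunks = {"1-2": 10, "1-3": 8, "1-4": 10, "2-3": 10, "2-4": 8, "3-4": 10}
hier_trunks = {"1-2": 6, "1-3": 14, "1-4": 7, "2-3": 7, "2-4": 14, "3-4": 8}

print(round(cost_ratio(simple_trunks), 3))
print(round(cost_ratio(hier_trunks), 3))
print(round(weighting_factor(simple_trunks, hier_trunks), 3))
```

The factor exceeds unity whenever the hierarchical design places a
larger share of its trunks on the inexpensive links than the simple
design does, which is the situation the adjustment is meant to reward.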

The symmetrical networks which were studied in detail operated in
the following fashion. Traffic which was blocked from any link was
overflowed to an alternate route consisting of two links in tandem. If
the call was blocked on this route it was offered to still another
two-link route, and so on until all such routes were exhausted. No call
was permitted to use a route which required more than two links in
tandem. Some experiments were made on networks which allowed three
tandem links to be used, but it was found that they were at best
marginally more efficient than the two-link maximum network at
engineered loads and deteriorated much more violently under overloads.
Therefore, they are not considered further in this paper. The order of
selection of alternate routes was arranged to approximately equalize the
load overflowed to each route. Although this is probably not the most
efficient arrangement, it should be adequate to illustrate the behavior
of symmetrical networks.
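
The symmetrical routing rule can be sketched in Python as follows. This
is an illustration under stated assumptions rather than the simulation
program used in the study: the tandem nodes are tried in plain
numerical order (the study ordered them so as to equalize overflow
loads), and the trunk counts are hypothetical.

```python
def candidate_routes(origin, dest, nodes):
    """Direct link first, then every two-link route via one tandem node."""
    routes = [[(origin, dest)]]
    for via in nodes:
        if via not in (origin, dest):
            routes.append([(origin, via), (via, dest)])
    return routes


def place_call(origin, dest, nodes, free_trunks):
    """Try each route in order and seize one trunk per link on the first
    route whose links all have a free trunk. Return that route, or None
    if the call is blocked on every route."""
    for route in candidate_routes(origin, dest, nodes):
        keys = [tuple(sorted(link)) for link in route]
        if all(free_trunks[k] > 0 for k in keys):
            for k in keys:
                free_trunks[k] -= 1
            return route
    return None


# Hypothetical four-node network; values are idle trunks on each link.
nodes = [1, 2, 3, 4]
free = {(1, 2): 0, (1, 3): 1, (1, 4): 1, (2, 3): 0, (2, 4): 1, (3, 4): 1}
first = place_call(1, 2, nodes, free)   # direct and via-3 routes are busy
second = place_call(1, 2, nodes, free)  # no idle route remains
print(first, second)
```

No route longer than two links is ever generated, in keeping with the
two-link maximum adopted for the symmetrical networks.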

The hierarchical networks operated in a manner similar to the Bell
System toll network, with the difference that whereas in the Bell System
the routes are selected link by link, in the simulation the routes are
entirely preselected at the originating node. If a network is drawn as
shown in the hierarchical sketches in Figs. 2 to 9, the route selection
is made by hunting up the hierarchy, starting from the terminating
office, in the distant region, and then down the hierarchy in the home
region. (A region can be thought of as all centers whose final routes
ultimately terminate at the same highest level switching center.) Thus,
for example, in the four node network shown in Fig. 4, the alternate
routes for traffic offered to link 1-2 are:

For half the traffic, 1-4-2, 1-3-2, 1-3-4-2; and for the remainder of the
traffic, 1-3-2, 1-4-2, 1-3-4-2.


                      OVERALL SYSTEM CHARACTERISTICS

                 Ave. Load  Weighting       (a)              (b)              (c)
Case             per Link    Factor    Hier.  Symmet.   Hier.  Symmet.   Hier.  Symmet.
3 Nodes, light      4        1.194     1.155   1.014    1.156   1.128    1.203   1.124
4 Nodes, light      3.5      1.119     1.181   1.034    1.185   1.229    1.251   1.184
5 Nodes, light      4        1.181     1.288   1.074    1.305   1.305    1.372   1.210
6 Nodes, light      4        1.194     1.263   1.117    1.328   1.363    1.324   1.245
3 Nodes, heavy     10        1.121     1.076   1.034    1.073   1.164    1.188   1.180
4 Nodes, heavy     17.5      1.064     1.088   1.040    1.121   1.271    1.140   1.202
5 Nodes, heavy     27.5      1.071     1.100   1.062    1.147   1.309    1.151   1.274
6 Nodes, heavy     28.3      1.085     1.134   1.074    1.205   1.364    1.203   1.322

[Headings of the paired column groups (a), (b), and (c) are illegible in
the source; only the words "Engineered Load" survive.]

Fig. 10 - Over-all system characteristics.


4.1 General Observations 


In order to draw conclusions from this study as to the relative merits 
of various types of alternate routing systems under different load condi- 
tions, Figs. 10 to 18 will be examined and the significance of the results 
discussed. 

As a very first step, a cursory examination of all figures reveals the 
following: 

(1) The relative effectiveness of alternate routing networks, whether 
measured by E or M, tends to decrease with overload, with the 
decrease occurring more rapidly under uniform than under local 
overload. Although in some cases the network remains superior to 
a simple network even for 100 per cent overload, the relative per- 
formance at such overloads is almost always poorer than at en- 
gineered loads. This is due, of course, to the fact that the average 
number of links per call increases with overload, causing a de- 
crease in efficiency which may outweigh the gains yielded by the 
larger effective access provided by the alternate route system. 
(See Fig. 10.) 

(2) Light networks (those with less traffic) gain more from alternate 
routing than do heavy networks. This seems to occur because 
systems designed for large parcels of traffic use large efficient 
groups. Thus providing alternate routes in heavy networks, which 
increases the effective access somewhat, does not materially 
increase the efficiency, while the degradation caused by using 
several links per call is nevertheless present. In lighter networks, 
the increase in efficiency owing to the larger effective access is 
substantial, overriding the degradation and causing a considerable 
gain in effectiveness. 

Perhaps to this list should be added: 

(3) As mentioned above, symmetrical systems do not appear to 
benefit from allowing more than two links in tandem to be used 
by any call. This effect is apparently caused by the decrease in 
efficiency which results from using many links per call overriding 
the gain yielded by increased access. In this situation, of course, 
the increase in link occupancy may be substantial, while the 
increase in effective access is likely to be small. 

Fig. 11 - Relative performance measure of symmetrical networks under
uniform overloads.

Fig. 12 - Relative performance measure of symmetrical networks under
local overloads.


4.2 Symmetrical Networks 


The curves shown in Figs. 11 through 18, when studied closely, reveal
much information regarding the characteristics of the networks
considered. In Fig. 11, the high relative performance measure of light
symmetrical networks at engineered loads and the rapid decline as the
load is increased uniformly are clearly indicated. The heavier networks
exhibit a lower relative value of M at engineered loads and also decline
rapidly, bringing their performance measure down to very low relative
values at high overloads. Such a rapid decrease in M, it would appear,
would make it impracticable to install symmetrical systems in many
actual applications, were it not for the fact that M can be kept
relatively high by canceling alternate routing at some appropriate
point. The dotted lines in Fig. 11 indicate the relative performance
measure if alternate routing is canceled, and it is clear that this
factor can be kept above 0.9, regardless of the size of the network and
even for 100 per cent overload. In any event, it does appear that for
extremely heavy networks the decline is so precipitous that this method
of alternate routing might well prove to be inapplicable. Fig. 12,
however, illustrates the real strength of the symmetrical routing
doctrine. The relative performance measure is shown to be almost
constant under local overloads, and remains above unity for all but the
largest, heaviest networks. Furthermore, this sort of alternate routing
structure is likely to be quite efficient at engineered loads in systems
where call setup time and switching delays are no longer negligible,
since it generally uses a relatively small number of links per call, as
evidenced in Fig. 10.

Fig. 13 - Relative performance measure of hierarchical networks under
uniform overloads.

A symmetrical network structure then can be devised which has the
following characteristics:

(1) Performance measure (and thus efficiency) is high at engineered
loads.

(2) Local overloads are well tolerated, with the network remaining
efficient and not allowing any parcel of traffic to suffer excessive
blocking.

(3) If alternate routing can be canceled at the appropriate point,
then the performance measure can be maintained at a tolerable
level even under severe uniform overloads.

(4) The average number of links per call is quite low at engineered
loads, increasing rapidly as overloads are applied.

Fig. 14 - Relative performance measure of hierarchical networks under
local overloads.

An important practical question in (3), however, is whether a network
control can be devised to cancel alternate routing easily, and how
the control can determine the degree of overload. Another disadvantage
of such networks is the unavailability, at present, of any but the very
crudest methods of trunk engineering. However, this type of network is,
in principle, capable of satisfying the four points listed above, all of
which are desirable and often are difficult to attain concurrently.

Fig. 15 - Relative efficiency of symmetrical networks under uniform
overloads.


4.3 Hierarchical Networks 


In Fig. 13, the relative performance measure of hierarchical networks
under uniform overloads is shown. A comparison with Fig. 11 indicates
that the relative M is higher at engineered loads for symmetrical than
for hierarchical light networks and not too different for heavy
networks, although the decline with uniform overload is more rapid in
the former case. In the hierarchical system, however, the relative M
cannot be increased by complete cancellation of alternate routing, since
this increases the blocking on some parcels of traffic which are offered
to high usage groups to a high level. Fig. 13 then shows that, although
light networks retain their effectiveness up to 100 per cent overload,
large heavy networks show a decline with uniform overloads to a quite
low value of relative M.

Fig. 16 - Relative efficiency of symmetrical networks under local
overloads.

Fig. 17 - Relative efficiency of hierarchical networks under uniform
overloads.

Fig. 18 - Relative efficiency of hierarchical networks under local
overloads.

The behavior of such networks under local overloads is shown in Fig. 
14. In these circumstances the relative performance measure declines 
slowly from the value at engineered load as the local overload is in- 
creased. The decline is sufficiently gradual to enable the lighter networks 
to retain an M greater than unity for all overloads considered. The 
heavier networks, however, are unable to do this, and the value of 
relative M for the worst network declines almost to 0.8 for the greatest 
local overload. 

The essential operating characteristics of networks of this basic design 
then appear to be as follows: 

(1) The performance measure (and thus efficiency) tend to be high 
at engineered loads (if the variation in trunk costs is taken into 
account). 

(2) The performance measure declines at a moderate rate under
uniform overload, reaching rather low values for large, heavy
networks. No simple corrective measures are available to improve
the situation but no special measures are needed to prevent
catastrophic performance degradation. (Certain more complicated
corrective measures, such as selective cancellation of alternate
routing, might prove effective, but this sort of procedure was not
studied.)

(3) Local overloads are moderately well tolerated, with the relative
performance measure showing a gradual decline with increasing
load, and dipping below unity for some cases.

(4) A relatively large number of links are used per call at engineered 
loads, and this number increases gradually with overloads. 

This is then essentially a moderately well behaving network, providing 
neither superlative nor intolerable service at any level of load. It requires 
no complex controls to keep operating reasonably well, and is relatively 
simple to implement without the need for sophisticated switching equip- 
ment at the tandem points. In a real system, with switching delays and 
appreciable call setup time, however, this type of network may behave 
badly under overloads, since some calls use many links in tandem, and 
therefore can tie up a great deal of equipment when processing a call, 
even though the call is not completed. In fact, the large number of links 
used per call in hierarchical systems even at engineered loads is a source 
of inefficiency in such systems. 

An apparent peculiarity in all the curves is the superiority of large 
light networks over small light networks at engineered loads, with the 
situation reversing as the load increases, so that at 100 per cent overload 
the small light networks are generally superior. A qualitative explanation 
of this would again involve the average number of links per call, which 
increases more rapidly in large networks than in small ones. The heavy 
networks do not exhibit this effect at all, and the larger heavy networks 
always appear to perform less well than the smaller ones. Since the
larger (heavy) networks were designed to be more heavily loaded than the
smaller ones (see Fig. 10), however, this effect is more likely to be a
result of network load than of size.


4.4 Efficiency Curves 


Figs. 15 to 18 portray the network efficiencies in the several cases 
studied. In general, these curves display a somewhat shallower slope 
than the corresponding curves for M. This implies that as the load is 
increased, not only does the relative network efficiency decrease, but the 
blocking encountered by the most poorly served group of customers also 
increases more rapidly when alternate routing is in effect than when it is 
not. The only exception to this is the symmetrical system under local 
overload (Figs. 12 and 16). In this case the relative efficiency and the 
relative performance measure decline at about the same rate, which 
implies that in this system the blocking remains at essentially the same 
level whether a symmetrical or simple system is in use. This is an im- 
portant consideration in favor of symmetrical networks, particularly 
since both efficiency and performance measure remain reasonably high 
for all types of overloads considered. 
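
The inference drawn from comparing the two families of curves follows
directly from the definition M = E(1 - B). Applying it to both the
alternate routing network and the simple reference network, and
dividing, shows that relative M divided by relative E equals
(1 - B) for the network divided by (1 - B) for the simple network. The
Python sketch below performs this division; the function name and the
sample readings are hypothetical and are not values taken from the
figures.

```python
def availability_ratio(relative_m, relative_e):
    """Return (1 - B_network) / (1 - B_simple), which follows from
    M = E * (1 - B) applied to both networks."""
    if relative_e <= 0.0:
        raise ValueError("relative efficiency must be positive")
    return relative_m / relative_e


# Hypothetical readings from a pair of curves at one overload level.
steeper_m = availability_ratio(0.90, 1.00)  # relative M has fallen below relative E
equal_fall = availability_ratio(1.05, 1.05)  # both have moved together
print(steeper_m, equal_fall)
```

A ratio below unity indicates that the worst-served traffic fares worse
with alternate routing than without it, while a ratio of unity
corresponds to the symmetrical network under local overload, where the
two curves decline at about the same rate.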


V. CONCLUSIONS 


The foregoing discussion of various types of alternate routing net- 
works may be of use in determining whether alternate routing structures 
should be incorporated into particular switching systems and, if so, of 
what sort they should be. Many of these factors have long been known 
and used by network designers, and the present study should provide 
additional documentation. In the case of factors not previously con- 
sidered, this study may provide justification for their incorporation into 
future designs. Some of these are as follows: 

(1) If the overload capability of the system is not important, some 
sort of alternate routing system is almost certainly justified on 
economic grounds. 

(2) If local overload capability is important, then strong
consideration should be given to a symmetrical alternate routing
network, since this configuration allows the blocking to be kept to a
minimum under local overloads while retaining a high network
efficiency.

(3) If uniform overload capability is an important consideration, 
then alternate routing structures should be contemplated with 
caution, but can still be used if the average load per link is small 
and appropriate action, such as cancellation of alternate routing 
(either uniformly or selectively) can be taken as required. 

(4) If the average load per link is small, alternate routing almost 
always is advantageous, while if it is large, the advantage is some- 
times questionable. 

(5) If the initial efficiency is an important criterion, then the
selection of the type of alternate routing may well depend upon the
geography of the particular system. Thus, in certain situations, where
small towns communicate primarily with nearby cities, a hierarchical
structure may be preferable, while if there is a large group of
approximately equal-sized cities spread over the country, then a
symmetrical system could prove to be superior.

(6) If switching equipment is expensive or call setup time is long,
symmetrical networks may prove to be superior to hierarchical
structures at engineered loads regardless of the geography. This
would come about because of the large number of links per call,
and hence the large amount of switching equipment used by
hierarchical networks. Clearly, long setup time in this case would
lead to inefficient trunk usage, since trunks in one link would be
held while the call progressed along a multi-link path.

(7) Although not shown specifically in these results, a multi-alternate 
route structure provides service protection, which a simple layout 
does not, and a well connected symmetrical network is likely to be 
less vulnerable to damage than a hierarchical system. 

Most actual systems, of course, must be designed to be efficient at 
engineered loads, and yet must also be able to accept either uniform 
or local overloads without excessive degradation of service. Furthermore, 
real networks usually serve many small towns communicating primarily 
with larger cities, which in turn communicate with each other on a 
roughly equal basis. Therefore, the network designer must decide which 
of these often conflicting criteria are most important, and develop a 
system which satisfies these as closely as it can within the limitations 
imposed by the switching and signaling equipment and the available 
methods of trunk engineering. It is quite likely that the best system in
most situations is some combination of symmetrical and hierarchical
networks, not necessarily of the particular kinds studied here.
Furthermore, the advent of electronic switching systems and high speed
signaling devices has made feasible alternate routing doctrines which
are dependent on the state of the system, and these may well prove to be
superior to any system with a completely prespecified alternate routing
structure.
However, an analysis of the basic characteristics of simpler networks is 
likely to be useful in predicting the behavior and influencing the design 
of specific, more complex systems. It was this potential application 
which motivated the studies described in this paper. 